(1) GENERAL INFORMATION:

(i) APPLICANT: Takanori OKURA
Kakuji TORIGOE
Masashi KURIMOTO

(ii) TITLE OF INVENTION: GENOMIC DNA ENCODING A POLYPEPTIDE CAPABLE OF INDUCING THE PRODUCTION OF INTERFERON-γ

(iii) NUMBER OF SEQUENCES: 35

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: BROWDY AND NEIMARK 

(B) STREET: 419 Seventh Street, N.W., Suite 300 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20004 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 185,305/96 

(B) FILING DATE: 27-JUN-1996

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: BROWDY, Roger L. 

(B) REGISTRATION NUMBER: 25,618 

(C) REFERENCE/DOCKET NUMBER: OKURA=1

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-628-5197 

(B) TELEFAX: 202-737-3528 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 157 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn
1               5                   10                  15

Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp
            20                  25                  30

Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile
        35                  40                  45

Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
    50                  55                  60

Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys Glu Asn Lys Ile
65                  70                  75                  80

Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys
                85                  90                  95

Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys
            100                 105                 110

Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu
        115                 120                 125

Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu
    130                 135                 140

Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp
145                 150                 155

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1120 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(F) TISSUE TYPE: liver

(ix) FEATURE:

(A) NAME/KEY: 5' UTR

(B) LOCATION: 1..177

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 178..285

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 286..756

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3' UTR

(B) LOCATION: 757..1120

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GCCTGGACAG TCAGCAAGGA ATTGTCTCCC AGTGCATTTT GCCCTCCTGG CTGCCAACTC 60
TGGCTGCTAA AGCGGCTGCC ACCTGCTGCA GTCTACACAG CTTCGGGAAG AGGAAAGGAA 120
CCTCAGACCT TCCAGATCGC TTCCTCTCGC AACAAACTAT TTGTCGCAGG AATAAAG 177

ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA ATG 225
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala Met
    -35                 -30                 -25

AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA GCT GAA GAT GAT GAA AAC 273
Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala Glu Asp Asp Glu Asn
-20                 -15                 -10                 -5

CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA GTC ATA 321
Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile
                1               5                   10

AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT CGG CCT 369
Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro
        15                  20                  25

CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA GAT AAT GCA CCC CGG 417
Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg
    30                  35                  40

ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG 465
Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met
45                  50                  55                  60

GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT TCA AYT CTC TCC TGT 513
Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys
                65                  70                  75

GAG AAC AAA ATT ATT TCC TTT AAG GAA ATG AAT CCT CCT GAT AAC ATC 561
Glu Asn Lys Ile Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile
            80                  85                  90

AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG AGA AGT GTC CCA GGA 609
Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly
        95                  100                 105

CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA TAC GAA GGA TAC TTT 657
His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe
    110                 115                 120

CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA CTC ATT TTG AAA AAA 705
Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys
125                 130                 135                 140

GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC ACT GTT CAA AAC GAA 753
Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu
                145                 150                 155

GAC TAGCTATTAA AATTTCATGC CGGGCGCAGT GGCTCACGCC TGTAATCCCA 806
Asp

GCCCTTTGGG AGGCTGAGGC GGGCAGATCA CCAGAGGTCA GGTGTTCAAG ACCAGCCTGA 866
CCAACATGGT GAAACCTCAT CTCTACTAAA AATACTAAAA ATTAGCTGAG TGTAGTGACG 926
CATGCCCTCA ATCCCAGCTA CTCAAGAGGC TGAGGCAGGA GAATCACTTG CACTCCGGAG 986
GTAGAGGTTG TGGTGAGCCG AGATTGCACC ATTGCGCTCT AGCCTGGGCA ACAACAGCAA 1046
AACTCCATCT CAAAAAATAA AATAAATAAA TAAACAAATA AAAAATTCAT AATGTGAAAA 1106
AAAAAAAAAA AAAA 1120

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 135 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..135

(C) IDENTIFICATION METHODS: S

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

 AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA 47
Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser
    -5                  1               5                   10

GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT 95
Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn
                15                  20                  25

CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA G 135
Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp
            30                  35                  40

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..134 

(C) IDENTIFICATION METHODS: S 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

 AT AAT GCA CCC CGG ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC 47
Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser
40                  45                  50                  55

CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT 95
Gln Pro Arg Gly Met Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile
                60                  65                  70

TCA ACT CTC TCC TGT GAG AAC AAA ATT ATT TCC TTT AAG 134
Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile Ser Phe Lys
            75                  80



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..87

(C) IDENTIFICATION METHODS: S

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG 50
         Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val
             -35                 -30                 -25

GCA ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G 87
Ala Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
        -20                 -15                 -10



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 



(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..12

(C) IDENTIFICATION METHODS: S 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

 CT GAA GAT GAT G 12
Ala Glu Asp Asp Glu
-10

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2167 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon + 3' UTR

(B) LOCATION: 1..2167

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



GAA ATG AAT CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA 48
Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile
85                  90                  95                  100

TTC TTT CAG AGA AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA 96
Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu
                105                 110                 115

TCT TCA TCA TAC GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC 144
Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp
            120                 125                 130

CTT TTT AAA CTC ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT 192
Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser
        135                 140                 145

ATA ATG TTC ACT GTT CAA AAC GAA GAC TAGCTAT TAAAATTTCA TGCCGGGCGC 246
Ile Met Phe Thr Val Gln Asn Glu Asp
    150                 155

AGTGGCTCAC GCCTGTAATC CCAGCCCTTT GGGAGGCTGA GGCGGGCAGA TCACCAGAGG 306
TCAGGTGTTC AAGACCAGCC TGACCAACAT GGTGAAACCT CATCTCTACT AAAAATACAA 366
AAAATTAGCT GAGTGTAGTG ACCCATGCCC TCAATCCCAG CTACTCAAGA GGCTGAGGCA 426
GGAGAATCAC TTGCACTCCG GAGGTGGAGG TTGTGGTGAG CCGAGATTGC ACCATTGCGC 486
TCTAGCCTGG GCAACAACAG CAAAACTCCA TCTCAAAAAA TAAAATAAAT AAATAAACAA 546
ATAAAAAATT CATAATGTGA ACTGTCTGAA TTTTTATGTT TAGAAAGATT ATGAGATTAT 606
TAGTCTATAA TTGTAATGGT GAAATAAAAT AAATACCAGT CTTGAAAAAC ATCATTAAGA 666 
AATGAATGAA CTTTCACAAA AGCAAACAAA CAGACTTTCC CTTATTTAAG TGAATAAAAT 726 
AAAATAAAAT AAAATAATGT TTAAAAAATT CATAGTTTGA AAACATTCTA CATTGTTAAT 786 
TGGCATATTA ATTATACTTA ATATAATTAT TTTTAAATCT TTTGGGTTAT TAGTCCTAAT 846
GACAAAAGAT ATTGATATTT GAACTTTCTA ATTTTTAAGA ATATCGTTAA ACCATCAATA 906 
TTTTTATAAG GAGGCCACTT CACTTGACAA ATTTCTGAAT TTCCTCCAAA GTCAGTATAT 966 
TTTTAAAATT CAGTTTGATC CTGAATCCAG CAATATATAA AAGGGATTAT ATACTCTGGC 1026
CAACTGACAT TCATCCTAGG AATGCAAAGA TGGTTTAATA TCCTAAAATC AATTAACATA 1086
ACATACTATA TTAATAAAGT ATCAAAACAG TATTCTCATC TTTTTTTCTT TTTTCACAAT 1146
TCCTTGGTTA CACTATCATC TCAATAGATG CAGAAAAAGC ATTTGACAAA ATCCAATTCA 1206
TAATAAAAAT TCTCAAACTT GAAAGAGAAC ATCATAAAGG CATCTATGAA AAACCTACAG 1266
CTAATATCAT ACTTAACGAT GAAAAACTGA ATTATTTTAC CCTAAGATCA AGAATAATGC 1326
AAGCATGTCA GCTCTTGCAA CTTCTATTCA ACATTGTACT GGAGGTTCTA GCCAGAGCAA 1386
CCATACAATA AATAAAAATA AAAGGCACCC AGATTAGAAA GGAAGTCTTT ATTTGCAGAC 1446
AACATGGTTC TTTATGCAGA AAACCGTCAG GAATACACAC ACATGTTAGA ACTAATAAGT 1506
TCAGCAAGGT TGCAGGTTGC AATATCAATA TGCAAAAATA CATTGAAGGC TGGGCTCAGT 1566 
GGAGATGGCA TGTACCTTTC GTCCCAGCTA CTTGGGAGGC TGAGGTAGGA GGATCACTTG 1626 
AGGTGAGGAG TTTGAGGCTA TAGTGCAATG TGATCTTGCC TGTGAATAGC CACTGCACTC 1686
GAGCCTAGGC AACAAAGTGA GACCCCGTCT CCAAAAAAAA AAATGGTATA TTGGTATTTC 1746
TGTATATGAA CAATGAATGA TCTGAAAACA AGAAAATTCC ATTCACGATG GTATTAAAAA 1806
AATAAAATAC AAATAAATTT AGCAAAATAA TTATAAAACT TGTACATCGA AAATTTCAAA 1866
GCACTCTGAG GGAAATTAAA GATGATCTAA ATAATTGGAG AGACACTCTA TGATCACTGA 1926
TTGGAAAATT CATTCAATAT TGTTAAGATA ACAATTGTCC CCAAATTGAT GCATGCATTC 1986
AATTTAGTCT TCATCAAAAT TCCAGCAGGG TTTTTGCAGA AATTGACAAG CTGTACCCAA 2046
AATGTATATG GAAATGAAAA GACCCAGAAG AGCAAATAAT TTTTTAAAAA CAAAGTTGGA 2106
AAACTTTTAC TTCCTAATTT TAAAACTTAC TATAAACCTA AAGTTATCAA GACCATTTAG 2166
T 2167

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1334 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..1334

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTATTTTTTT TAATTCGCAA ACATAGAAAT GACTAGCTAC TTCTTCCCAT TCTGTTTTAC 60 

TGCTTACATT GTTCCGTGCT AGTCCCAATC CTCAGATGAA AAGTCACAGG AGTGACAATA 120 

ATTTCACTTA CAGGAAACTT TATAAGGCAT CCACGTTTTT TAGTTGGGGT AAAAAATTGG 180

ATACAATAAG ACATTGCTAG GGGTCATGCC TCTCTGAGCC TGCCTTTGAA TCACCAATCC 240

CTTTATTGTG ATTGCATTAA CTGTTTAAAA CCTCTATAGT TGGATGCTTA ATCCCTGCTT 300 

GTTACAGCTG AAAATGCTGA TAGTTTACCA GGTGTGGTGG CATCTATCTG TAATCCTAGC 360

TACTTGGGAG GCTCAAGCAG GAGGATTGCT TGAGGCCAGG ACTTTGAGGC TGTAGTACAC 420

TGTGATCGTA CCTGTGAATA GCCACTGCAC TCCAGCCTGG GTGATATACA GACCTTGTCT 480 

CTAAAATTAA AAAAAAAAAA AAAAAAAACC TTAGGAAAGG AAATTGATCA AGTCTACTGT 540

GCCTTCCAAA ACATGAATTC CAAATATCAA AGTTAGGCTG AGTTGAAGCA GTGAATGTGC 600 

ATTCTTTAAA AATACTGAAT ACTTACCTTA ACATATATTT TAAATATTTT ATTTAGCATT 660

TAAAAGTTAA AAACAATCTT TTAGAATTCA TATCTTTAAA ATACTCAAAA AAGTTGCAGC 720

GTGTGTGTTG TAATACACAT TAAACTGTGG GGTTGTTTGT TTGTTTGAGA TGCAGTTTCA 780

CTCTGTCACC CAGGCTGAAG TGCAGTGCAG TGCAGTGGTG TGATCTCGGC TCACTACAAC 840

CTCCACCTCC CACGTTCAAG CGATTCTCAT GCCTCAGTCT CCCGAGTAGG TGGGATTACA 900 

GGCATGCACC ACTTACACCC GGCTAATTTT TGTATTTTTA GTAGAGCTGG GGTTTCACCA 960 

TGTTGGCCAG GCTGGTCTCA AACCCCTAAC CTCAAGTGAT CTGCCTGCCT CAGCCTCCCA 1020

AACAAACAAA CAACCCCACA GTTTAATATG TGTTACAACA CACATGCTGC AACTTTTATG 1080 

AGTATTTTAA TGATATAGAT TATAAAAGGT TGTTTTTAAC TTTTAAATGC TGGGATTACA 1140 

GGCATGAGCC ACTGTGCCAG GCCTGAACTG TGTTTTTAAA AATGTCTGAC CAGCTGTACA 1200

TAGTCTCCTG CAGACTGGCC AAGTCTCAAA GTGGGAACAG GTGTATTAAG GACTATCCTT 1260

TGGTTAAATT TCCGCAAATG TTCCTGTGCA AGAATTCTTC TAACTAGAGT TCTCATTTAT 1320

TATATTTATT TCAG 1334 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4773 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..4773

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GTAAGACTGA GCCTTACTTT GTTTTCAATC ATGTTAATAT AATCAATATA ATTAGAAATA 60 
TAACATTATT TCTAATGTTA ATATAAGTAA TGTAATTAGA AAACTCAAAT ATCCTCAGAC 120 
CAACCTTTTG TCTAGAACAG AAATAACAAG AAGCAGAGAA CCATTAAAGT GAATACTTAC 180 
TAAAAATTAT CAAACTCTTT ACCTATTGTG ATAATGATGG TTTTTCTGAG CCTGTCACAG 240
GGGAAGAGGA GATACAACAC TTGTTTTATG ACCTGCATCT CCTGAACAAT CAGTCTTTAT 300
ACAAATAATA ATGTAGAATA CATATGTGAG TTATACATTT AAGAATAACA TGTGACTTTC 360
CAGAATGAGT TCTGCTATGA AGAATGAAGC TAATTATCCT TCTATATTTC TACACCTTTG 420
TAAATTATGA TAATATTTTA ATCCCTAGTT GTTTTGTTGC TGATCCTTAG CCTAAGTCTT 480
AGACACAAGC TTCAGCTTCC AGTTGATGTA TGTTATTTTT AATGTTAATC TAATTGAATA 540
AAAGTTATGA GATCAGCTGT AAAAGTAATG CTATAATTAT CTTCAAGCCA GGTATAAAGT 600
ATTTCTGGCC TCTACTTTTT CTCTATTATT CTCCATTATT ATTCTCTATT ATTTTTCTCT 660
ATTTCCTCCA TTATTGTTAG ATAAACCACA ATTAACTATA GCTACAGACT GAGCCAGTAA 720
GAGTAGCCAG GGATGCTTAC AAATTGGCAA TGCTTCAGAG GAGAATTCCA TGTCATGAAG 780
ACTCTTTTTG AGTGGAGATT TGCCAATAAA TATCCGCTTT CATGCCCACC CAGTCCCCAC 840
TGAAAGACAG TTAGGATATG ACCTTAGTGA AGGTACCAAG GGGCAACTTG GTAGGGAGAA 900
AAAAGCCACT CTAAAATATA ATCCAAGTAA GAACAGTGCA TATGCAACAG ATACAGCCCC 960 
CAGACAAATC CCTCAGCTAT CTCCCTCCAA CCAGAGTGCC ACCCCTTCAG GTGACAATTT 1020 
GGAGTCCCCA TTCTAGACCT GACAGGCAGC TTAGTTATCA AAATAGCATA AGAGGCCTGG 1080 
GATGGAAGGG TAGGGTGGAA AGGGTTAAGC ATGCTGTTAC TGAACAACAT AATTAGAAGG 1140
GAAGGAGATG GCCAAGCTCA AGCTATGTGG GATAGAGGAA AACTCAGCTG CAGAGGCAGA 1200
TTCAGAAACT GGGATAAGTC CGAACCTACA GGTGGATTCT TGTTGAGGGA GACTGGTGAA 1260
AATGTTAAGA AGATGGAAAT AATGCTTGGC ACTTAGTAGG AACTGGGCAA ATCCATATTT 1320
GGGGGAGCCT GAAGTTTATT CAATTTTGAT GGCCCTTTTA AATAAAAAGA ATGTGGCTGG 1380
GCGTGGTGGC TCACACCTGT AATCCCAGCA CTTTGGGAGG CCGAGGGGGG CGGATCACCT 1440
GAAGTCAGGA GTTCAAGACC AGCCTGACCA ACATGGAGAA ACCCCATCTC TACTAAAAAT 1500
ACAAAATTAG CTGGGCGTGG TGGCATATGC CTGTAATCCC AGCTACTCGG GAGGCTGAGG 1560 
CAGGAGAATC TTTTGAACCC GGGAGGCAGA GGTTGCGATG AGCCTAGATC GTGCCATTGC 1620 
ACTCCAGCCT GGGCAACAAG AGCAAAACTC GGTCTCAAAA AAAAAAAAAA AAAAGTGAAA 1680 
TTAACCAAAG GCATTAGCTT AATAATTTAA TACTGTTTTT AAGTAGGGCG GGGGGTGGCT 1740 
GGAAGAGATC TGTGTAAATG AGGGAATCTG ACATTTAAGC TTCATCAGCA TCATAGCAAA 1800
TCTGCTTCTG GAAGGAACTC AATAAATATT AGTTGGAGGG GGGGAGAGAG TGAGGGGTGG 1860 
ACTAGGACCA GTTTTAGCCC TTGTCTTTAA TCCCTTTTCC TGCCACTAAT AAGGATCTTA 1920 
GCAGTGGTTA TAAAAGTGGC CTAGGTTCTA GATAATAAGA TACAACAGGC CAGGCACAGT 1980 
GGCTCATGCC TATAATCCCA GCACTTTGGG AGGGCAAGGC GAGTGTCTCA CTTGAGATCA 2040
GGAGTTCAAG ACCAGCCTGG CCAGCATGGC GATACTCTGT CTCTACTAAA AAAAATACAA 2100
AAATTAGCCA GGCATGGTGG CATGCACCTG TAATCCCAGC TACTCGTGAG CCTGAGGCAG 2160 
AAGAATCGCT TGAAACCAGG AGGTGTAGGC TGCAGTGAGC TGAGATCGCA CCACTGCACT 2220 
CCAGCCTGGG CGACAGAATG AGACTTTGTC TCAAAAAAAG AAAAAGATAC AACAGGCTAC 2280 
CCTTATGTGC TCACCTTTCA CTGTTGATTA CTAGCTATAA AGTCCTATAA AGTTCTTTGG 2340
TCAAGAACCT TGACAACACT AAGAGGGATT TGCTTTGAGA GGTTACTGTC AGAGTCTGTT 2400
TCATATATAT ACATATACAT GTATATATGT ATCTATATCC AGGCTTGGCC AGGGTTCCCT 2460
CAGACTTTCC AGTGCACTTG GGAGATGTTA GGTCAATATC AACTTTCCCT GGATTCAGAT 2520
TCAACCCCTT CTGATGTAAA AAAAAAAAAA AAAAAGAAAG AAATCCCTTT CCCCTTGGAG 2580
CACTCAAGTT TCACCAGGTG GGGCTTTCCA AGTTGGGGGT TCTCCAAGGT CATTGGGATT 2640
GCTTTCACAT CCATTTGCTA TGTACCTTCC CTATGATGGC TGGGAGTGGT CAACATCAAA 2700
ACTAGGAAAG CTACTGCCCA AGGATGTCCT TACCTCTATT CTGAAATGTG CAATAAGTGT 2760
GATTAAAGAG ATTGCCTGTT CTACCTATCC ACACTCTCGC TTTCAACTGT AACTTTCTTT 2820
TTTTCTTTTT TTCTTTTTTT CTTTTTTTTT GAAACGGAGT CTCGCTCTGT CGCCCAGGCT 2880
AGAGTGCAGT GGCACGATCT CAGCTCACTG CAAGCTCTGC CTCCCGGGTT CACGCCATTC 2940
TCCTGCCTCA CCCTCCCAAG CAGCTGGGAC TACAGGCGCC TGCCACCATG CCCAGCTAAT 3000
TTTTTGTATT TTTAGTAGAG ACGGGGTTTC ACCGTGTTAG CCAGGATGGT CTCGATCTCC 3060
TGAACTTGTG ATCCGCCCGC CTCAGCCTCC CAAAGTGCTG GGATTACAGG CGTGAGCCAT 3120
CGCACCCGGC TCAACTGTAA CTTTCTATAC TGGTTCATCT TCCCCTGTAA TGTTACTAGA 3180
GCTTTTGAAG TTTTGGCTAT GGATTATTTC TCATTTATAC ATTAGATTTC AGATTAGTTC 3240
CAAATTGATG CCCACAGCTT AGGGTCTCTT CCTAAATTGT ATATTGTAGA CAGCTGCAGA 3300 
AGTGGGTGCC AATAGGGGAA CTAGTTTATA CTTTCATCAA CTTAGGACCC ACACTTGTTG 3360 
ATAAAGAACA AAGGTCAAGA GTTATGACTA CTGATTCCAC AACTGATTGA GAAGTTGGAG 3420 
ATAACCCCGT GACCTCTGCC ATCCAGAGTC TTTCAGGCAT CTTTGAAGGA TGAAGAAATG 3480
CTATTTTAAT TTTGGAGGTT TCTCTATCAG TGCTTAGGAT CATGGGAATC TGTGCTGCCA 3540
TGAGGCCAAA ATTAAGTCCA AAACATCTAC TGGTTCCAGG ATTAACATGG AAGAACCTTA 3600
GGTGGTGCCC ACATGTTCTG ATCCATCCTG CAAAATAGAC ATGCTGCACT AACAGGAAAA 3660
GTGCAGGCAG CACTACCAGT TGGATAACCT GCAAGATTAT AGTTTCAAGT AATCTAACCA 3720
TTTCTCACAA GGCCCTATTC TGTGACTGAA ACATACAAGA ATCTGCATTT GGCCTTCTAA 3780
GGCAGGGCCC AGCCAAGGAG ACCATATTCA GGACAGAAAT TCAAGACTAC TATGGAACTG 3840
GAGTGCTTGG CAGGGAAGAC AGAGTCAAGG ACTGCCAACT GAGCCAATAC AGCAGGCTTA 3900
CACAGGAACC CAGGGCCTAG CCCTACAACA ATTATTGGGT CTATTCACTG TAAGTTTTAA 3960
TTTCAGGCTC CACTGAAAGA GTAAGCTAAG ATTCCTGGCA CTTTCTGTCT CTCTCACAGT 4020
TGGCTCAGAA ATGAGAACTG GTCAGGCCAG GCATGGTGGC TTACACCTGG AATCCCAGCA 4080
CTTTGGGAGG CCGAAGTGGG AGGGTCACTT GAGGCCAGGA GTTCAGGACC AGCTTAGGCA 4140
ACAAAGTGAG ATACCCCCTG ACCCCTTCTC TACAAAAATA AATTTTAAAA ATTAGCCAAA 4200 
TGTGGTGGTG TATACTTACA GTCCCAGCTA CTCAGGAGGC TGAGGCAGGG GGATTGCTTG 4260 
AGCCCAGGAA TTCAAGGCTG CAGTGAGCTA TGATTTCACC ACTGCACTTC TGGCTGGGCA 4320 
ACAGAGCGAG ACCCTGTCTC AAAGGAAAAA GAAAAAGAAA CTAGAACTAG CCTAAGTTTG 4380
TGGGAGGAGG TCATCATCGT CTTTAGCCGT GAATGGTTAT TATAGAGGAC AGAAATTGAC 4440
ATTAGCCCAA AAAGCTTGTG GTCTTTGCTG GAACTCTACT TAATCTTGAG CAAATGTGGA 4500
CAGCACTCAA TGGGAGAGGA GAGAAGTAAG CTGTTTGATG TATAGGGGAA AACTAGAGGC 4560
CTGGAACTGA ATATGCATCC CATGACAGGG AGAATAGGAG ATTCGGAGTT AAGAAGGAGA 4620
GGAGGTCAGT ACTGCTGTTC AGAGATTTTT TTTATGTAAC TCTTGAGAAG CAAAACTACT 4680
TTTGTTCTGT TTGGTAATAT ACTTCAAAAC AAACTTCATA TATTCAAATT GTTCATGTCC 4740
TGAAATAATT AGGTAATGTT TTTTTCTCTA TAG 4773

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8835 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..8835

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GTAAGAAATA TCATTCCTCT TTATTTGGAA AGTCAGCCAT GGCAATTAGA GGTAAATAAG 60 

CTAGAAAGCA ATTGAGAGGA ATATAAACCA TCTAGCATCA CTACGATGAG CAGTCAGTAT 120

CAACATAAGA AATATAAGCA AAGTCAGAGT AGAATTTTTT TCTTTTATCA GATATGGGAG 180 

AGTATCACTT TAGAGGAGAG GTTCTCAAAC TTTTTGCTCT CATGTTCCCT TTACACTAAG 240

CACATCACAT GTTAGCATAA GTAACATTTT TAATTAAAAA TAACTATGTA CTTTTTTAAC 300

AACAAAAAAA AGCATAAAGA GTGACACTTT TTTATTTTTA CAAGTGTTTT AACTGGTTTA 360

ATAGAAGCCA TATAGATCTG CTGGATTCTC ATCTGCTTTG CATTCAGACT ACTGCAATAT 420 

TGCACAGAAT GCAGCCTCTG GTAAACTCTG TTGTACACTC ATGAGAGAAT GGGTGAAAAA 480 

GACAAATTAC GTCTTAGAAT TATTAGAAAT AGCTTTCACT TTAGGAACTC CCTGAGAATT 540 

GCTGCTTTAG AGTGGTAAGA TAAATAAGCT TCTCTTTAAA CGGAATCTCA AGACAGAATC 600

AGTTACATTA AAAGCAAACA AAAAATTTGC CCATGGTTAG TCATCTTGTG AAATCTGCCA 660 

CACCTTTGGA CTGGGCTACA ATTGGATAAT ATAGCATTCC CCGAGATAAT TTTCTCTCAC 720

AATTAAGGAA AGGGCTGAAT AAATATCTCT GTTTGAAGTT GAATAACAAA AATTAGGACC 780

CCCTAAATTT TAGGGCTCCT GAAATTCGTC TTTTTGCCTA TATTCAGCTA CTTTACGTTC 840

TATTAAATCT TCTTTCAGGC CAGGTGCACT AGCTCATGCC TAGAATCTCA GGCAGGCCTG 900 

AGCCCAGGAA TTTGAGACCA GCCAGGGCAA CACAGTCTCT ACAAAAAAAT AAAAAATTAC 960 

CTGGGTGTGT TGGTGCATGC CTGTAGAACT ACTCAGGATG CTGAGGACTG CTTGAGCCCA 1020

GGATAGCCAA ATCTGTGGTG AGTTCAGCCA CTAAACAGAG CGAGACTTTC TCAAAAAAAC 1080

AAACAAAAAA ACAAACAAAC TTCCTTCAAA ATAACTTTTT ATCTGCAATG TTTTCCTATT 1140

GCCTGTGAGA TTAAATTTAC TCTTTTACCT GATTTCCAAA GCCCTCCATA ATCTAATCCG 1200

ACTTTACCTT GTGTTCACTG CAAAATAGCA GGACTGTTCC ACTACAATCC AAAAATCACA 1260

GGTTGGGTGC AGTGGCTCAC TCCTGTAATC CCAACACTTT GGAAGGCCAA GGCAGGTGGA 1320 

TTGCTTCAGC TCAGGAGTTC AAGACCAGCC TGGGCAACAT GGCAAAAACC CTGTCTCTCC 1380 

AAAACATACA AAAATTAGCC AGATGTGGTA GTATGTGCCT GTAGTCCCAA CTACTCAAAA 1440

GGCTAAGGCA AGAGGATCAC TTGAGCCCAG GAGGTCAAGG CTACAGTGAG CCATGTTTAC 1500 
TGTGTCACTG CACTCCAGCC TGGGTGATAG AGCAAGACCA TGTCTCAAAA AAAAAAAAAA 1560
GAAAAGAAAA GAAAAAAACA TCGCTCTATT CAGTTCACCC CCACCACAAC ATTGTTTTGA 1620
TTATCACATA AATGCTGGTC CATTGCCTTC TCTATCTATT CAAATCTTTA AGCATTCTTT 1680
GAGATTCAAC TCAATTCTCC TTTTCAAACT AGGCCATTTA AACTACATCA GTTCCATTTT 1740
GATTTTCTTG CTTTGAGTCT ACAGACTCAA AAACAAAAAC TTAAAAACTT ATTTTTTAAG 1800
TTTTCTGCTA CTCTCACTTC TTCAACACTC ACATACACGC ATTCATAATA AGATGGCAGA 1860
ATGTTCAAGG ATAAAATGAT TTATAGAACT GAAAAGTTAG GTTTTGATCT TGTTGCTGTC 1920
AAGATGACTA CCTACCTGAT CTCAGGTAAT TAATTATGTA GCATGCTCCC TCATTTCATC 1980
CCATACCTAT TCAACAGGAT TGGAATTCCA CAGCAAGGAT AAACATAATC ATAGTTGCTT 2040
TTCAAGTTCA AGGCATTTTA ACTTTTAATC TAGTAGTATG TTTGTTGTTG TTGTTGTTGT 2100 
TTGAGATGGA GCCCTGCTGT GTCACCCAGG CTGGAGTGCA GTGGCACGAA CTCGGCTCAC 2160 
TGCAACCTCT GCCTCATGGG TTCAATCAGT TATTCTGCCT CAGTGTCCCA AGTAGCTGGG 2220
ACTACAAGGC ACATGCCACC ATGCCTGGCT AATTTTTGTA TTTTTAGTAG AAACAGGGCT 2280
TCACCATGTT GGCCAGGCTG GTCTCGAACT CCTGACCTCA AGTGATCCAG CCGCCTCGGC 2340
CTCCCAAAGT GCTGGGATTA CAGGCATAAG CCACCGTGCC CAGCCTAATA GTATGTTTTT 2400
AAACTCTTAG TGGCTTAACA ATGCTGGTTG TATAATAAAT ATGCCATAAA TATTTACTGT 2460
CTTAGAATTA TGAAGAAGTG GTTACTAGGC CGTTTGCCAC ATATCAATGG TTCTCTCCTT 2520
ACAGCTTTAA TTAGAGTCTA GAATTGCAGG TTGGTAGAGC TGGAACAGAC CTTAAAGATT 2580
GACTAGCCAA CTTCCTTGTC CAAATGAGGG AACTGAGACC CTTAAAATTA AGTGACTTGC 2640 
CCCAGACAAA ACTGGAACTC ATGTGTCCTA ATTTCCATCA TGAAATTCTA CCATTCACTA 2700
GCCTCTGGCT AGTTGTCAAA GTATTGCATA ACTAAATTTT TATGTCTGTT TTAAAGAACA 2760
AATTGTCACT GCTTACTCCT GGGAGGGTCT TTCTGAGGTG GTTTATAACT CTTAAAAAAA 2820
AAAAAGTCAG TAGTCTGAGA ATTTTAGACG AAATAGTCAA AGCATTTTTA TCCAATGGAT 2880
CTATAATTTT CATAGATTAG AGTTAAATCA AAGAAACACG GATGAGAAAG GAAGAGGAAA 2940
ATTGAGGAGA GGAGGAATGG GGATGAGAAC ACACTACTTG TAATCAGTCA TAGATGTACT 3000
GAGAACTAAC AAGAAGAATT GTAAGAAAAT AAGAATGAAG AATTCAAAAT CAACACATGA 3060
AATAAAAAGA AACTACTAGG GAAAAATGGA GAAGACATTA GAAAAATTAT TCTATTTTTA 3120 
AAATTCTGTT TTCAGGCTTC CCTCCTGTTC TTCCTCCTTC TCATTGGTTT TCAGGTGGAG 3180
GGAAAGTTTA AGATGGAAAA AATATATATA TTCTACACAT CCCTTTCTAC GCTGTTGTCA 3240
TGGCAACAAG GTTTATCATA GCAAACTTTT ATTCATACAA CATTTATTGA GTTCTTACTG 3300
TGTGGTAAGC TCTTTCCAGG TGTTGAAAAT TCAGGGGAAA AAAGACAACT CATTGTCTTA 3360
AAACTCAGAT GAAAGCTGAA CAGACCTATT TTTAATCAAA GTAATCTCAA TTTAGGGTAG 3420
TAAGAGCTAT TTAAGAAGCA TGAACAGGTG TGAAGGAGGT AGGACTCTGA GGAGAGAATA 3480
GTTAGCTAGG AATGAAAGAG CAGAGAAGTT TTCCTAGAGG AACTATTAAA GCTGGGAGTT 3540
ACGGGATGAA AGATGAGGCA GGGTTTGCAG GCAAAAAAAA AAAAAAGGCA GGGGAAGGGG 3600
AAGTTCTGGC CTGGCAGAGA GAATAACTGT GGCAACAATG GAGGAGAGTC TGGAAGCAAG 3660
AAAACCAAGT AGAAGAGTAT TAAAATAGAA GATGCCAGGG GTAATGAGGG CTTGATTTAA 3720
AACAGTGCTG TTGGAGATGG AGAGGAGATA CCAAATTCTG GAGACATTTC TGAGTTAGAA 3780
CCTACAGTAT TTATCAGACA AGGGAAAGAT TAGACAAAGG AGTTAAGAAT GACTCCCAGG 3840
TTTCAGTTTG GGGCAGGTAA CTAGGACATG TTTTGAAAAG TAATGTATTG GATCTCTTAC 3900
CATTGGAACT ATGTATGTGG AGCCAAATTA AAATTTGTAC ATGTATATAA CTCTCCCCCC 3960
ACCACCAGTA ACTACTTCCC TAACTCTCTA CTTTGTAGCC AGACTTCCTA AAAGAATAGT 4020
TTGTAGTCAC TGTCTTTACT TTTCCCCTCC CATTCTGTCC TAGATATTTG TCCACCTACC 4080
ATCTGCTGCC TCCACTTTAC CCAAACTGTT CTACGGTTGC CCAAAACTTC CTAATTGCCA 4140
AATTCAATGA ACAAGTTTAA GCTTATATGT AAATTAGGAG CTCTACAGTT TGATTTCGAG 4200 
CAGCCCCTCC TGAAACCCTT TCTCTTTCGA CTTCTGTGAC ACATCTCAGA TTTACAAAAC 4260 
TGAACTAATT ATTTTACACT TGAGCTGTAT TTTCGTTCTT CTTTCTTGAT GAATGAGGTA 4320 
ACCACTCAAC AAATTGCCCA AGCCAAAAAC TACGAAGTCA TCCTCAGTTC CTCCTTCTTC 43 80 
TGTTTGACCC ACAACAGATC AGCTGAGAAA TCCCGCTGTT TAGTATCTCT TGAATTCATT 4440
ACCTTAATTT ATAGCCTCAT CAACTCTTAA TTGTTAAAAT TACTTCAGTA GTTGTTGTCT 4500
GACCTCTGTC CAATCTTGTT CAATCAGGTC CATTCTTTTG TTCTTGGTGG TGGTGGTGGT 4560
GTTGACAGAG TTTCGCTTTT GCTGCCCAGG CTGAAGTGCA GTGGAGCACT TCACTGCAAC 4620
CACAGCCTCC TGGGTTTAAG CAGTTCACCC TCCCGAGTAG CTGGGACTAC AGGTATGTGC 4680 
CACCACACCC AGCTAATTTT GTGTTTTCAG TAGAGACAGG GTTTCACCAT GTTGGTCAGG 4740
CTGGTCTCAA ACTCCTGACC TCAAGCAATC CACCCACCTC AGCCTCCCAA AGTGCTGGGA 4800
TTACAGGCAT GAGCCACTGC ACACGGACCA GATCCATTGT TTATGTTGCT TCTAGAGTGA 4860
GTTTTTAAAA CACAAATTTG ACCATATCTT TCTCCAATTT AAGTCAGTAT TTTTTTTTTC 4920
AGGAAAAAAC AGTTCAAACT CTTTAGTCTG CTTACACAAG GCCTTTGTAG TCTGACTCTT 4980
CTTTCCAAGC TTTCATCAAA GTATACTGCA AGTTACATTT TATGTGAATT GAATTAGGCA 5040
ACGGTATAAA AATTATAGTT TATATGGGCA AAATGGAAAT AATGTTAACT CTTCCAAATA 5100
GTTTATCTAG AATGACATAA TTTCAAAGCT GTCAGGTCAA ATGAGTTATA AACTGTTAAC 5160
ACTATTGCCA CATGCAAGTG TCTCTTATAC TTGGTAGAAT TATCTGCTTC CATGTCATTA 5220
TTATGTAAAT TAGACTTTAA ATAACTCAGA AGTTCTTCAG ACATACAGGT TATTATTGTG 5280
CTTTTTAAAC ATAATTTTAA ATAATTTTAT ATATGATAAT GTTATCCAAG TGCTAAGGGA 5340
TGTATTGTTA CTGCTGTGCA AAAAAAAAAA AAAAAAAAAC TCCAAATAAA TATGTTGAAA 5400
CCAAGTTTAT ATGCAAGAAA ACAATATTAA AAAGGCCAAA GTACCACCAT AATAGGCTGT 5460
GTGGAGACGG CAGGCTACAA AACACTAGTA ATAATGCTGA GAAAGTTGAA AAAAGAAAGA 5520
AAGCAACAAT ATGCTTTGGT TGTTGTAGGT TTATGTACTC CAAGAATATC TCCTCTCAAA 5580
CTTTTACGTT TTTTCCAAAG AAAAGTTAAC TTTGGCTGGG CGCAGTGGCT CTTGCCTGTA 5640
GTCCCAGCCT TTGGGAGGCC AAGGCGGGCA GATCACCTGA GGTCAGGAGT TTGAGACCAG 5700
CCTGACCAAA AATGGAGAAA CCCGCCCCCC TCACTACTAA AAGAATACAA AATTAGGCCG 5760 
GGCACAGTGG CTTACCCCTG TGATCCCAGC ACTTTGGGAG GCCGAAGCAG GAAGATCACC 5820 
TGAGGTCAGG AGTTCGAGAC CAGCCATGGA GAAACCCGTC TCTACTAAAA ATACAAAATT 5880
AGCCGGGCGT GGTGGTGCAT GACTGTAATC CCAGCTACTC AGGAGGCTAA GGCAGAGAAT 5940
CACTTGAACC CAGGCAGTGG AGGTTGCAGT GAGCCGAGAT CGTGCCATTG CACTCCAGCC 6000 
TGGGCAACAA GAGCGAAACT CTGTATCCAA AAAACAAAAG AAAAGAAAAG GTAACCTTGA 6060
ACTATGTGAG ATCTTTAGAA ATGCATTCTT TCTGTAAAAT GTGACTACAT TTGCCTTATT 6120
TATGGTAAAA ATGTTGAGGC CTCAAACAAC CCATATTTTC TCGGTCTCCC CGCTGCCTAG 6180
CCTTTGTTCA CATTGCTTCT TCTTGGTGGA AGCTCTTCCT CTGGCCTTGA AAATGCCTGC 6240
TTCTCTTTCA AGGTAGCACA GTCATCACTT TCTGTGGTAA CCTTCTCCAG CACCATCAAA 6300
CAGAAAGAAT GAATCTCTTG TAAATTCAGC TCTTACGTCA TTCATTACAT TATTTTGTAA 6360
CTCTTTATAG ATTCTTCTCT CCCACTAGAC TCTGAGTCAC TGGAGAGTAG GAGCCAACTC 6420
TCATTCATGT GTGGTTTGGT CAGCTACTGG CCACATTCCT GATGCATAGT TAATGCTCAA 6480
ACCTTAACTG GTGAATCAGC TCAAATATTG TCCTTCTCTA AATCCATTCA CTCATTGACT 6540
AACTATGTAC TCAAAATAGT AAACACCAGT AATTTAATCC AATTCCTGCC CATACTGCTT 6600
GGTACATTTC AGGTGAATTA GTTTGATAAA TATGTGTGTA TTACATAATA TTAAAGTATG 6660 
TACAGAAGAT CATGCTAATC ATAATTCACA ACTGATAACT AATCAAACAT AAATGCTCTC 6720 
AGGTTAACAA ATGTCTGCCT TCTCAGTTAA TGCAGTGATT AACAAACACC TTCTGATGCT 6780
GATAATAGGG CCTTGTTCAG CAATGAAGCC ATAAAGGTGA ATAAAGAACA TGCCCTCGTG 6840
GAGCTCACAG CCTAGTCATT ATTGTTCTGA TTTTTAATAT TAATGTTGGT TTGGGTTTTG 6900
GTGAAAAATG TTTAGACTTA TCTTAGTGAT CTTTTCATCC TTTGCTATAT TATTTTTCTC 6960
TAAGAGTCTT CCTTATCCCC TCCTTTAAAA AACTAGGTGA TAATTCTAAA TTGTAAATTT 7020
AAATATTATA AATAGCTTAT AAAATTTAAT ATTTATAATA TTTAAATGTT TGATAAATAT 7080
TTAAATTTTA TAATATTTAA ATGTTTATTT AAATTCATTT GTACATCAGT TTTTATTTTA 7140
TTTAAATGTG TTGGCCAGGC ATGGTGGCTG ACACCTATAA TCCCAGAACT TTGAGAGGCC 7200 
AAGTCAGGCA AACCATTTGA GCTCAGGAGT TTGAGACCAC CCTGGGCAAC GTGGTGAAAC 7260
CCTGTCTCTA CCAAACATAT GAAAACTTAT CTGGGTGTGG TGGCACGCAT CTGTGGTCCC 7320
AGATGGGAGT CCCAGGCTAA GATGGGAGAA TCGCTTGAAC CCAGGTGAGA GGGGTGGGGT 7380
GGATGTTGCA GTGAGCTGAG ATCGTGCCAC TGCACTCCAA CCTGGGTGAC AGAGTGAGAC 7440 
TCCATCTCAA AAAAAAAAAA TGTTATCTAA ATAAGATAAA TTTAATAACT GTTCGCACTT 7500 
AGATGAGCAT AAGGAACTAA ACCTAGATAA AACTATCAAA TAAGGCCTGG GTACAGTGAC 7560 
TCATGCCTGT AATCTCAAGC ACTTTGGGAG GCCAAAATTA TACAAAGTTA GTTGTATAAC 7620 
ACCAACTAAC AACTATTTTG GGGTTAGCTT AATTCAGATT AATTTTTTTT AAACTGAGTT 7680
TTAAATTCCT GCTTACTCTA CCATACATGC TAGGCCTCAT ATTATGCTAG AAAAATTTTG 7740
AGCACAGATT TATGAATACT CTCCTGCATA CCATTTAATT TTTAAACAAA TTTTAATGCA 7800 
GTATATATGT GCCTTTTTAC CAACACATTA AATAATAAGA TCTACTGTGA GGACTAAATT 7860 
TCTGTAATTT CAAAGTAGTA ATGAGTTTAA ACCATGTCTC AAGATCTCTG CAATAACTGT 7920
AGCACAACAG AAAATAGGTA TTTCTATTAA TGACAGAGTC ACAAGTACTA CTAATAATAC 7980
TGTGGTTTGT TTCCTGCAAC TAATCATGGG AGGAATGCTA AATTTCAGAG GTTGGTGAAA 8040
ATACATGTGT ATTTTTTTCC CCATCCAAGT TCACAGATTT CTCACACTGA GAACTCCTAT 8100
TCCATAACAA AATTCTGGAA GCCTGCACAC CGTATTGGAA GAAGGGCAGA AAGGAAAAGC 8160
AAATGGAAGG ATTTAAATTT TTTTCAAATC CTGTATCCCT TGATTTTACA GCAAGATTGT 8220
ATTTATGTAT TACTTGTGTT AAAAATATAG TATAATCGAG ACTCCAGATC AAAAATCACC 8280
GCAGCTCAGG GAGAAAGAGG GCCACCAAAT GCCAGAGCCC TTCAGCCTTC TCCCACCCTG 8340
CCTGTACCCT CAGATGGAAG CACTTTTTTA TCATTGTTTC ACCTTTAGCA TTTTGACAAT 8400 
GAAGTCACAA ACCTTCAGCC TCTCACCCAT AGGAACCCAC TGGTTGTAAG AGAAGGATGA 8460
AGCCAGTCCT TCCTAAAGGG CACGATTAGA TGTGTTTATG GCATCCTCAG GTGAAACTAT 8520
ATTTATATTG ACAATATATT TATATTTCTC AAGGAATACT AGAATAATGA TTCAGTTCAG 8580
TACTAGGCCA TTTATCTACC CTTTATAATA TTGTTTAATG AGAAAATGCT TTCTATCTTC 8640
CAAATATCTG ATGATTTGTA AGAGAACACT TAAACATGGG TATTCATAAG CTGAAACTTC 8700
TGGCATTTAT TGAATGTCAA GATTGTTCAT CAGTATACTA GGTGATTAAC TGACCACTGA 8760
ACTTGAAGGT AGTATAAAGT AGTAGTAAAA GGTACAATCA TTGTCTCTTA ACAGATGGCT 8820
CTTTGCTTTC ATTAG 8835

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1371 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..1371

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GTAAGGCTAA TGCCATAGAA CAAATACCAG GTTCAGATAA ATCTATTCAA TTAGAAAAGA 60
TGTTGTGAGG TGAACTATTA AGTGACTCTT TGTGTCACCA AATTTCACTG TAATATTAAT 120
GGCTCTTAAA AAAATAGTGG ACCTCTAGAA ATTAACCACA ACATGTCCAA GGTCTCAGCA 180 
CCTTGTCACA CCACGTGTCC TGGCACTTTA ATCAGCAGTA GCTCACTCTC CAGTTGGCAG 240
TAAGTGCACA TCATGAAAAT CCCAGTTTTC ATGGGAAAAT CCCAGTTTTC ATTGGATTTC 300
CATGGGAAAA ATCCCAGTAC AAAACTGGGT GCATTCAGGA AATACAATTT CCCAAAGCAA 360 
ATTGGCAAAT TATGTAAGAG ATTCTCTAAA TTTAGAGTTC CGTGAATTAC ACCATTTTAT 420 
GTAAATATGT TTGACAAGTA AAAATTGATT CTTTTTTTTT TTTTCTGTTG CCCAGGCTGG 480 
AGTGCAGTGG CACAATCTCT GCTCACTGCA ACCTCCACCT CCTGGGTTCA AGCAATTCTC 540 
CTGCCTCAGC CTTCTGAGTA GCTGGGACTA CAGGTGCATC CCGCCATGCC TGGCTAATTT 600
TTGGGTATTT TTACTAGAGA CAGGGTTTTG GCATGTTGTC CAGGCTGGTC TTGGACTCCT 660 
GATCTCAGAT GATCCTCCTG GCTCGGGCTC CCAAAGTGCT GGGATTACAG GCATGAACCA 720 
CCACACATGG CCTAAAAATT GATTCTTATG ATTAATCTCC TGTGAACAAT TTGGCTTCAT 780
TTGAAAGTTT GCCTTCATTT GAAACCTTCA TTTAAAAGCC TGAGCAACAA AGTGAGACCC 840
CATCTCTACA AAAAACTGCA AAATATCCTG TGGACACCTC CTACCTTCTG TGGAGGCTGA 900 
AGCAGGAGGA TCACTTGAGC CTAGGAATTT GAGCCTGCAG TGAGCTATGA TCCCACCCCT 960 
ACACTCCAGC CTGCATGACA GTAGACCCTG ACACACACAC ACAAAAAAAA ACCTTCATAA 1020
AAAATTATTA GTTGACTTTT CTTAGGTGAC TTTCCGTTTA AGCAATAAAT TTAAAAGTAA 1080
AATCTCTAAT TTTAGAAAAT TTATTTTTAG TTACATATTG AAATTTTTAA ACCCTAGGTT 1140
TAAGTTTTAT GTCTAAATTA CCTGAGAACA CACTAAGTCT GATAAGCTTC ATTTTATGGG 1200
CCTTTTGGAT GATTATATAA TATTCTGATG AAAGCCAAGA CAGACCCTTA AACCATAAAA 1260
ATAGGAGTTC GAGAAAGAGG AGTAGCAAAA GTAAAAGCTA GAATGAGATT GAATTCTGAG 1320
TCGAAATACA AAATTTTACA TATTCTGTTT CTCTCTTTTT CCCCCTCTTA G 1371
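
Each sequence row in this listing carries six groups of ten residues followed by the cumulative position of the last residue in the row. The sketch below shows one way to check a row against its stated position. It is an illustration only: the function names `parse_listing_line` and `check_running_total` are hypothetical, and the rule of joining trailing digit groups (so that a split index such as "6 0" reads as 60) is an assumption about how scanning damage appears, not part of the listing format. The two example rows are the first two rows of SEQ ID NO: 11.

```python
import re

def parse_listing_line(line):
    """Split one listing row into (residues, stated_end_position)."""
    tokens = line.split()
    # Trailing all-digit tokens are treated as pieces of the position index
    # (assumption: a scan may split "60" into "6 0").
    digit_tokens = []
    while tokens and tokens[-1].isdigit():
        digit_tokens.insert(0, tokens.pop())
    residues = "".join(tokens).upper()
    if not re.fullmatch(r"[ACGT]+", residues):
        raise ValueError(f"unexpected characters in residues: {residues!r}")
    stated_end = int("".join(digit_tokens)) if digit_tokens else None
    return residues, stated_end

def check_running_total(lines):
    """Return (computed_end, stated_end, matches) for each row in order."""
    total = 0
    results = []
    for line in lines:
        residues, stated_end = parse_listing_line(line)
        total += len(residues)
        results.append((total, stated_end, total == stated_end))
    return results

# First two rows of SEQ ID NO: 11; the first index is deliberately shown split.
EXAMPLE_LINES = [
    "GTAAGGCTAA TGCCATAGAA CAAATACCAG GTTCAGATAA ATCTATTCAA TTAGAAAAGA 6 0",
    "TGTTGTGAGG TGAACTATTA AGTGACTCTT TGTGTCACCA AATTTCACTG TAATATTAAT 120",
]

for computed, stated, ok in check_running_total(EXAMPLE_LINES):
    print(computed, stated, ok)
```

A row whose computed and stated positions disagree is a candidate for a dropped or duplicated residue and would need checking against the original document.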

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3383 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..3383

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GTAAAGTAGA AATGAATTTA TTTTTCTTTG CAAACTAAGT ATCTGCTTGA GACACATCTA 60 

TCTCACCATT GTCAGCTGAG GAAAAAAAAA AATGGTTCTC ATGCTACCAA TCTGCCTTCA 120 

AAGAAATGTG GACTCAGTAG CACAGCTTTG GAATGAAGAT GATCATAAGA GATACAAAGA 180 

AGAACCTCTA GCAAAAGATG CTTCTCTATG CCTTAAAAAA TTCTCCAGCT CTTAGAATCT 240 

ACAAAATAGA CTTTGCCTGT TTCATTGGTC CTAAGATTAG CATGAAGCCA TGGATTCTGT 300

TGTAGGGGGA GCGTTGCATA GGAAAAAGGG ATTGAAGCAT TAGAATTGTC CAAAATCAGT 360 

AACACCTCCT CTCAGAAATG CTTTGGGAAG AAGCCTGGAA GGTTCCGGGT TGGTGGTGGG 420 

GTGGGGCAGA AAATTCTGGA AGTAGAGGAG ATAGGAATGG GTGGGGCAAG AAGACCACAT 480

TCAGAGGCCA AAAGCTGAAA GAAACCATGG CATTTATGAT GAATTCAGGG TAATTCAGAA 540 

TGGAAGTAGA GTAGGAGTAG GAGACTGGTG AGAGGAGCTA GAGTGATAAA CAGGGTGTAG 600

AGCAAGACGT TCTCTCACCC CAAGATGTGA AATTTGGACT TTATCTTGGA GATAATAGGG 660

TTAATTAAGC ACAATATGTA TTAGCTAGGG TAAAGATTAG TTTGTTGTAA CAAAGACATC 720
CAAAGATACA GTAGGTGAAT AAGATAGAGA ATTTTTCTCT CAAAGAAAGT CTAAGTAGGC 780
AGCTCAGAAG TAGTATGGCT GGAAGCAACC TGATGATATT GGGACCCCCA ACCTTCTTCA 840
GTCTTGTACC CATCATCCCC TAGTTGTTGA TCTCACTCAC ATAGTTGAAA ATCATCATAC 900
TTCCTGGGTT CATATCCCAG TTATCAAGAA AGGGTCAAGA GAAGTCAGGC TCATTCCTTT 960
CAAAGACTCT AATTGGAAGT TAAACACATC AATCCCCCTC ATATTCCATT GACTAGAATT 1020
TAATCACATG GCCACACCAA GTGCAAGGAA ATCTGGAAAA TATAATCTTT ATTCCAGGTA 1080
GCCATATGAC TCTTTAAAAT TCAGAAATAA TATATTTTTA AAATATCATT CTGGCTTTGG 1140
TATAAAGAAT TGATGGTGTG GGGTGAGGAG GCCAAAATTA AGGGTTGAGA GCCTATTATT 1200
TTAGTTATTA CAAGAAATGA TGGTGTCATG AATTAAGGTA GACATAGGGG AGTGCTGATG 1260
AGGAGCTGTG AATGGATTTT AGAAACACTT GAGAGAATCA ATAGGACATG ATTTAGGGTT 1320
GGATTTGGAA AGGAGAAGAA AGTAGAAAAG ATGATGCCTA CATTTTTCAC TTAGGCAATT 1380
TGTACCATTC AGTGAAATAG GGAACACAGG AGGAAGAGCA GGTTTTGGTG TATACAAAGA 1440
GGAGGATGGA TGACGCATTT CGTTTTGGAT CTGAGATGTC TGTGGAACGT CCTAGTGGAG 1500
ATGTCCACAA ACTCTTCTAC ATGTGGTTCT GAGTTCAGGA CACAGATTTG GGCTGGAGAT 1560
AGAGATATTG TAGGCTTATA CATAGAAATG GCATTTGAAT CTATAGAGAT AAAAAGACAC 1620
ATCAGAGGAA ATGTGTAAAG TGAGAGAGGA AAAGCCAAGT ACTGTGCTGG GGGGAATACC 1680
TACATTTAAA GGATGCAGTA GAAAGAAGCT AATAAACAAC AGAGAGCAGA CTAACCAAAA 1740
GGGGAGAAGA AAAACCAAGA GAATTCCACC GACTCCCAGG AGAGCATTTC AAGATTGAGG 1800
GGATAGGTGT TGTGTTGAAT TTTGCAGCCT TGAGAATCAA GGGCCAGAAC ACAGCTTTTA 1860
GATTTAGCAA CAAGGAGTTT GGTGATCTCA GTGAAAGCAG CTTGATGGTG AAATGGAGGC 1920
AGAGGCAGAT TGCAATGAGT GAAACAGTGA ATGGGAAGTG AAGAAATGAT ACAGATAATT 1980
CTTGCTAAAA GCTTGGCTGT TAAAAGGAGG AGAGAAACAA GACTAGCTGC AAAGTGAGAT 2040
TGGGTTGATG GAGCAGTTTT AAATCTCAAA ATAAAGAGCT TTGTGCTTTT TTGATTATGA 2100
AAATAATGTG TTAATTGTAA CTAATTGAGG CAATGAAAAA AGATAATAAT ATGAAAGATA 2160
AAAATATAAA AACCACCCAG AAATAATGAT AGCTACCATT TTGATACAAT ATTTCTACAC 2220
TCCTTTCTAT GTATATATAC AGACACAGAA ATGCTTATAT TTTTATTAAA AGGGATTGTA 2280
CTATACCTAA GCTGCTTTTT CTAGTTAGTG ATATATATGG ACATCTCTCC ATGGCAACGA 2340
GTAATTGCAG TTATATTAAG TTCATGATAT TTCACAATAA GGGCATATCT TTGCCCTTTT 2400
TATTTAATCA ATTCTTAATT GGTGAATGTT TGTTTCCAGT TTGTTGTTGT TATTAACAAT 2460
GTTCCCATAA GCATTCCTGT ACACCAATGT TCACACATTT GTCTGATTTT TTCTTCAGGA 2520
TAAAACCCAG GAGGTAGAAT TGCTGGGTTG ATAGAAGAGA AAGGATGATT GCCAAATTAA 2580
AGCTTCAGTA GAGGGTACAT GCCGAGCACA AATGGGATCA GCCCTAGATA CCAGAAATGG 2640
CACTTTCTCA TTTCCCCTTG GGACAAAAGG GAGAGAGGCA ATAACTGTGC TGCCAGAGTT 2700
AAATTTGTAC GTGGAGTAGC AGGAAATCAT TTGCTGAAAA TGAAAACAGA GATGATGTTG 2760
TAGAGGTCCT GAAGAGAGCA AAGAAAATTT GAAATTGCGG CTATCAGGTA TGGAAGAGAG 2820
TGCTGAACTG GAAAACAAAA GAAGTATTGA CAATTGGTAT GCTTGTAATG GCACCGATTT 2880
GAACGCTTGT GCCATTGTTC ACCAGCAGCA CTCAGCAGCC AAGTTTGGAG TTTTGTAGCA 2940
GAAAGACAAA TAAGTTAGGG ATTTAATATC CTGGCCAAAT GGTAGACAAA ATGAACTCTG 3000
AGATCCAGCT GCACAGGGAA GGAAGGGAAG ACGGGAAGAG GTTAGATAGG AAATACAAGA 3060
GTCAGGAGAC TGGAAGATGT TGTGATATTT AAGAACACAT AGAGTTGGAG TAAAAGTGTA 3120
AGAAAACTAG AAGGGTAAGA GACCGGTCAG AAAGTAGGCT ATTTGAAGTT AACACTTCAG 3180
AGGCAGAGTA GTTCTGAATG GTAACAAGAA ATTGAGTGTG CCTTTGAGAG TAGGTTAAAA 3240
AACAATAGGC AACTTTATTG TAGCTACTTC TGGAACAGAA GATTGTCATT AATAGTTTTA 3300
GAAAACTAAA ATATATAGCA TACTTATTTG TCAATTAACA AAGAAACTAT GTATTTTTAA 3360
ATGAGATTTA ATGTTTATTG TAG 3383

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11464 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: 5' UTR

(B) LOCATION: 1..3

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4..82

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 83..1453

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 1454..1465

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 1466..4848

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4849..4865

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 4866..4983

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 4984..6317

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 6318..6451

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 6452..11224

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 11225..11443

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3' UTR

(B) LOCATION: 11444..11464

(C) IDENTIFICATION METHODS: E
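
The feature locations listed for SEQ ID NO: 13 can be cross-checked arithmetically. The following is a minimal sketch, not part of the listing: the tuple layout and the helper names `span_length`, `total_length` and `is_contiguous` are illustrative assumptions, while the coordinates are copied from the feature table (1-based, inclusive).

```python
# Feature table of SEQ ID NO: 13 as (key, start, end), 1-based inclusive.
SEQ13_FEATURES = [
    ("5' UTR", 1, 3),
    ("leader peptide", 4, 82),
    ("intron", 83, 1453),
    ("leader peptide", 1454, 1465),
    ("intron", 1466, 4848),
    ("leader peptide", 4849, 4865),
    ("mat peptide", 4866, 4983),
    ("intron", 4984, 6317),
    ("mat peptide", 6318, 6451),
    ("intron", 6452, 11224),
    ("mat peptide", 11225, 11443),
    ("3' UTR", 11444, 11464),
]

def span_length(start, end):
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1

def total_length(features, key):
    """Summed length of all segments carrying the given feature key."""
    return sum(span_length(s, e) for k, s, e in features if k == key)

def is_contiguous(features, sequence_length):
    """True if the segments tile 1..sequence_length with no gap or overlap."""
    expected_start = 1
    for _, start, end in features:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == sequence_length + 1

leader_nt = total_length(SEQ13_FEATURES, "leader peptide")
mature_nt = total_length(SEQ13_FEATURES, "mat peptide")
intron_lengths = [span_length(s, e) for k, s, e in SEQ13_FEATURES if k == "intron"]

print("leader nt:", leader_nt, "codons:", leader_nt // 3)
print("mature nt:", mature_nt, "codons:", mature_nt // 3)
print("intron lengths:", intron_lengths)
print("tiles 1..11464:", is_contiguous(SEQ13_FEATURES, 11464))
```

With these coordinates the mature-peptide segments sum to 471 nucleotides (157 codons), and the first two intron lengths, 1371 and 3383, agree with the lengths stated for SEQ ID NO: 11 and SEQ ID NO: 12.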

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

AAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA 48
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala
-35 -30 -25
ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G GTAAGG CTAATGCCAT 98
Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
-20 -15 -10

AGAACAAATA CCAGGTTCAG ATAAATCTAT TCAATTAGAA AAGATGTTGT GAGGTGAACT 158

ATTAAGTGAC TCTTTGTGTC ACCAAATTTC ACTGTAATAT TAATGGCTCT TAAAAAAATA 218 

GTGGACCTCT AGAAATTAAC CACAACATGT CCAAGGTCTC AGCACCTTGT CACACCACGT 278 

GTCCTGGCAC TTTAATCAGC AGTAGCTCAC TCTCCAGTTG GCAGTAAGTG CACATCATGA 338

AAATCCCAGT TTTCATGGGA AAATCCCAGT TTTCATTGGA TTTCCATGGG AAAAATCCCA 398

GTACAAAACT GGGTGCATTC AGGAAATACA ATTTCCCAAA GCAAATTGGC AAATTATGTA 458

AGAGATTCTC TAAATTTAGA GTTCCGTGAA TTACACCATT TTATGTAAAT ATGTTTGACA 518

AGTAAAAATT GATTCTTTTT TTTTTTTTCT GTTGCCCAGG CTGGAGTGCA GTGGCACAAT 578

CTCTGCTCAC TGCAACCTCC ACCTCCTGGG TTCAAGCAAT TCTCCTGCCT CAGCCTTCTG 638

AGTAGCTGGG ACTACAGGTG CATCCCGCCA TGCCTGGCTA ATTTTTGGGT ATTTTTACTA 698 

GAGACAGGGT TTTGGCATGT TGTCCAGGCT GGTCTTGGAC TCCTGATCTC AGATGATCCT 758

CCTGGCTCGG GCTCCCAAAG TGCTGGGATT ACAGGCATGA ACCACCACAC ATGGCCTAAA 818 

AATTGATTCT TATGATTAAT CTCCTGTGAA CAATTTGGCT TCATTTGAAA GTTTGCCTTC 878 

ATTTGAAACC TTCATTTAAA AGCCTGAGCA ACAAAGTGAG ACCCCATCTC TACAAAAAAC 938

TGCAAAATAT CCTGTGGACA CCTCCTACCT TCTGTGGAGG CTGAAGCAGG AGGATCACTT 998 

GAGCCTAGGA ATTTGAGCCT GCAGTGAGCT ATGATCCCAC CCCTACACTC CAGCCTGCAT 1058

GACAGTAGAC CCTGACACAC ACACACAAAA AAAAACCTTC ATAAAAAATT ATTAGTTGAC 1118 

TTTTCTTAGG TGACTTTCCG TTTAAGCAAT AAATTTAAAA GTAAAATCTC TAATTTTAGA 1178 

AAATTTATTT TTAGTTACAT ATTGAAATTT TTAAACCCTA GGTTTAAGTT TTATGTCTAA 1238

ATTACCTGAG AACACACTAA GTCTGATAAG CTTCATTTTA TGGGCCTTTT GGATGATTAT 1298

ATAATATTCT GATGAAAGCC AAGACAGACC CTTAAACCAT AAAAATAGGA GTTCGAGAAA 1358

GAGGAGTAGC AAAAGTAAAA GCTAGAATGA GATTGAATTC TGAGTCGAAA TACAAAATTT 1418

TACATATTCT GTTTCTCTCT TTTTCCCCCT CTTAG CT GAA GAT GAT G GTAAA 1470

Ala Glu Asp Asp Glu 
-10 

GTAGAAATGA ATTTATTTTT CTTTGCAAAC TAAGTATCTG CTTGAGACAC ATCTATCTCA 1530 

CCATTGTCAG CTGAGGAAAA AAAAAAATGG TTCTCATGCT ACCAATCTGC CTTCAAAGAA 1590 

ATGTGGACTC AGTAGCACAG CTTTGGAATG AAGATGATCA TAAGAGATAC AAAGAAGAAC 1650
CTCTAGCAAA AGATGCTTCT CTATGCCTTA AAAAATTCTC CAGCTCTTAG AATCTACAAA 1710
ATAGACTTTG CCTGTTTCAT TGGTCCTAAG ATTAGCATGA AGCCATGGAT TCTGTTGTAG 1770
GGGGAGCGTT GCATAGGAAA AAGGGATTGA AGCATTAGAA TTGTCCAAAA TCAGTAACAC 1830
CTCCTCTCAG AAATGCTTTG GGAAGAAGCC TGGAAGGTTC CGGGTTGGTG GTGGGGTGGG 1890
GCAGAAAATT CTGGAAGTAG AGGAGATAGG AATGGGTGGG GCAAGAAGAC CACATTCAGA 1950
GGCCAAAAGC TGAAAGAAAC CATGGCATTT ATGATGAATT CAGGGTAATT CAGAATGGAA 2010
GTAGAGTAGG AGTAGGAGAC TGGTGAGAGG AGCTAGAGTG ATAAACAGGG TGTAGAGCAA 2070
GACGTTCTCT CACCCCAAGA TGTGAAATTT GGACTTTATC TTGGAGATAA TAGGGTTAAT 2130
TAAGCACAAT ATGTATTAGC TAGGGTAAAG ATTAGTTTGT TGTAACAAAG ACATCCAAAG 2190
ATACAGTAGC TGAATAAGAT AGAGAATTTT TCTCTCAAAG AAAGTCTAAG TAGGCAGCTC 2250
AGAAGTAGTA TGGCTGGAAG CAACCTGATG ATATTGGGAC CCCCAACCTT CTTCAGTCTT 2310
GTACCCATCA TCCCCTAGTT GTTGATCTCA CTCACATAGT TGAAAATCAT CATACTTCCT 2370
GGGTTCATAT CCCAGTTATC AAGAAAGGGT CAAGAGAAGT CAGGCTCATT CCTTTCAAAG 2430
ACTCTAATTG GAAGTTAAAC ACATCAATCC CCCTCATATT CCATTGACTA GAATTTAATC 2490
ACATGGCCAC ACCAAGTGCA AGGAAATCTG GAAAATATAA TCTTTATTCC AGGTAGCCAT 2550
ATGACTCTTT AAAATTCAGA AATAATATAT TTTTAAAATA TCATTCTGGC TTTGGTATAA 2610
AGAATTGATG GTGTGGGGTG AGGAGGCCAA AATTAAGGGT TGAGAGCCTA TTATTTTAGT 2670
TATTACAAGA AATGATGGTG TCATGAATTA AGGTAGACAT AGGGGAGTGC TGATGAGGAG 2730
CTGTGAATGG ATTTTAGAAA CACTTGAGAG AATCAATAGG ACATGATTTA GGGTTGGATT 2790
TGGAAAGGAG AAGAAAGTAG AAAAGATGAT GCCTACATTT TTCACTTAGG CAATTTGTAC 2850
CATTCAGTGA AATAGGGAAC ACAGGAGGAA GAGCAGGTTT TGGTGTATAC AAAGAGGAGG 2910
ATGGATGAGG CATTTCGTTT TGGATCTGAG ATGTCTGTGG AACGTCCTAG TGGAGATGTC 2970
CACAAACTCT TCTACATGTG GTTCTGAGTT CAGGACACAG ATTTGGGCTG GAGATAGAGA 3030
TATTGTAGGC TTATACATAG AAATGGCATT TGAATCTATA GAGATAAAAA GACACATCAG 3090
AGGAAATGTG TAAAGTGAGA GAGGAAAAGC CAAGTACTGT GCTGGGGGGA ATACCTACAT 3150
TTAAAGGATG CAGTAGAAAG AAGCTAATAA ACAACAGAGA GCAGACTAAC CAAAAGGGGA 3210
GAAGAAAAAC CAAGAGAATT CCACCGACTC CCAGGAGAGC ATTTCAAGAT TGAGGGGATA 3270
GGTGTTGTGT TGAATTTTGC AGCCTTGAGA ATCAAGGGCC AGAACACAGC TTTTAGATTT 3330
AGCAACAAGG AGTTTGGTGA TCTCAGTGAA AGCAGCTTGA TGGTGAAATG GAGGCAGAGG 3390
CAGATTGCAA TGAGTGAAAC AGTGAATGGG AAGTGAAGAA ATGATACAGA TAATTCTTGC 3450
TAAAAGCTTG GCTGTTAAAA GGAGGAGAGA AACAAGACTA GCTGCAAAGT GAGATTGGGT 3510
TGATGGAGCA GTTTTAAATC TCAAAATAAA GAGCTTTGTG CTTTTTTGAT TATGAAAATA 3570
ATGTGTTAAT TGTAACTAAT TGAGGCAATG AAAAAAGATA ATAATATGAA AGATAAAAAT 3630
ATAAAAACCA CCCAGAAATA ATGATAGCTA CCATTTTGAT ACAATATTTC TACACTCCTT 3690
TCTATGTATA TATACAGACA CAGAAATGCT TATATTTTTA TTAAAAGGGA TTGTACTATA 3750
CCTAAGCTGC TTTTTCTAGT TAGTGATATA TATGGACATC TCTCCATGGC AACGAGTAAT 3810
TGCAGTTATA TTAAGTTCAT GATATTTCAC AATAAGGGCA TATCTTTGCC CTTTTTATTT 3870
AATCAATTCT TAATTGGTGA ATGTTTGTTT CCAGTTTGTT GTTGTTATTA ACAATGTTCC 3930
CATAAGCATT CCTGTACACC AATGTTCACA CATTTGTCTG ATTTTTTCTT CAGGATAAAA 3990
CCCAGGAGGT AGAATTGCTG GGTTGATAGA AGAGAAAGGA TGATTGCCAA ATTAAAGCTT 4050
CAGTAGAGGG TACATGCCGA GCACAAATGG GATCAGCCCT AGATACCAGA AATGGCACTT 4110
TCTCATTTCC CCTTGGGACA AAAGGGAGAG AGGCAATAAC TGTGCTGCCA GAGTTAAATT 4170
TGTACGTGGA GTAGCAGGAA ATCATTTGCT GAAAATGAAA ACAGAGATGA TGTTGTAGAG 4230
GTCCTGAAGA GAGCAAAGAA AATTTGAAAT TGCGGCTATC AGCTATGGAA GAGAGTGCTG 4290
AACTGGAAAA CAAAAGAAGT ATTGACAATT GGTATGCTTG TAATGGCACC GATTTGAACG 4350
CTTGTGCCAT TGTTCACCAG CAGCACTCAG CAGCCAAGTT TGGAGTTTTG TAGCAGAAAG 4410
ACAAATAAGT TAGGGATTTA ATATCCTGGC CAAATGGTAG ACAAAATGAA CTCTGAGATC 4470
CAGCTGCACA GGGAAGGAAG GGAAGACGGG AAGAGGTTAG ATAGGAAATA CAAGAGTCAG 4530
GAGACTGGAA GATGTTGTGA TATTTAAGAA CACATAGAGT TGGAGTAAAA GTGTAAGAAA 4590
ACTAGAAGGG TAAGAGACCG GTCAGAAAGT AGGCTATTTG AAGTTAACAC TTCAGAGGCA 4650
GAGTAGTTCT GAATGGTAAC AAGAAATTGA GTGTGCCTTT GAGAGTAGGT TAAAAAACAA 4710
TAGGCAACTT TATTGTAGCT ACTTCTGGAA CAGAAGATTG TCATTAATAG TTTTAGAAAA 4770
CTAAAATATA TAGCATACTT ATTTGTCAAT TAACAAAGAA ACTATGTATT TTTAAATGAG 4830
ATTTAATGTT TATTGTAG AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT 4880
Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu
-5 1 5
GAA TCT AAA TTA TCA GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC 4928
Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe
10 15 20
ATT GAC CAA GGA AAT CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC 4976
Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp
25 30 35
TGT AGA G GTATTTTTT TTAATTCGCA AACATAGAAA TGACTAGCTA CTTCTTCCCA 5032
Cys Arg Asp
40
TTCTGTTTTA CTGCTTACAT TGTTCCGTGC TAGTCCCAAT CCTCAGATGA AAAGTCACAG 5092
GAGTGACAAT AATTTCACTT ACAGGAAACT TTATAAGGCA TCCACGTTTT TTAGTTGGGG 5152
TAAAAAATTG GATACAATAA GACATTGCTA GGGGTCATGC CTCTCTGAGC CTGCCTTTGA 5212
ATCACCAATC CCTTTATTGT GATTGCATTA ACTGTTTAAA ACCTCTATAG TTGGATGCTT 5272
AATCCCTGCT TGTTACAGCT GAAAATGCTG ATAGTTTACC AGGTGTGGTG GCATCTATCT 5332
GTAATCCTAG CTACTTGGGA GGCTCAAGCA GGAGGATTGC TTGAGGCCAG GACTTTGAGG 5392
CTGTAGTACA CTGTGATCGT ACCTGTGAAT AGCCACTGCA CTCCAGCCTG GGTGATATAC 5452
AGACCTTGTC TCTAAAATTA AAAAAAAAAA AAAAAAAAAC CTTAGGAAAG GAAATTGATC 5512
AAGTCTACTG TGCCTTCCAA AACATGAATT CCAAATATCA AAGTTAGGCT GAGTTGAAGC 5572
AGTGAATGTG CATTCTTTAA AAATACTGAA TACTTACCTT AACATATATT TTAAATATTT 5632
TATTTAGCAT TTAAAAGTTA AAAACAATCT TTTAGAATTC ATATCTTTAA AATACTCAAA 5692
AAAGTTGCAG CGTGTGTGTT GTAATACACA TTAAACTGTG GGGTTGTTTG TTTGTTTGAG 5752
ATGCAGTTTC ACTCTGTCAC CCAGGCTGAA GTGCAGTGCA GTGCAGTGGT GTGATCTCGG 5812
CTCACTACAA CCTCCACCTC CCACGTTCAA GCGATTCTCA TGCCTCAGTC TCCCGAGTAG 5872
GTGGGATTAC AGGCATGCAC CACTTACACC CGGCTAATTT TTGTATTTTT AGTAGAGCTG 5932
GGGTTTCACC ATGTTGGCCA GGCTGGTCTC AAACCCCTAA CCTCAAGTGA TCTGCCTGCC 5992
TCAGCCTCCC AAACAAACAA ACAACCCCAC AGTTTAATAT GTGTTACAAC ACACATGCTG 6052
CAACTTTTAT GAGTATTTTA ATGATATAGA TTATAAAAGG TTGTTTTTAA CTTTTAAATG 6112
CTGGGATTAC AGGCATGAGC CACTGTGCCA GGCCTGAACT GTGTTTTTAA AAATGTCTGA 6172
CCAGCTGTAC ATAGTCTCCT GCAGACTGGC CAAGTCTCAA AGTGGGAACA GGTGTATTAA 6232
GGACTATCCT TTGGTTAAAT TTCCGCAAAT GTTCGTGTGC AAGAATTCTT CTAACTAGAG 6292
TTCTCATTTA TTATATTTAT TTCAG AT AAT GCA CCC CGG ACC ATA TTT ATT 6343
Asp Asn Ala Pro Arg Thr Ile Phe Ile
40 45
ATA AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG GCT GTA ACT ATC 6391
Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
50 55 60
TCT GTG AAG TGT GAG AAA ATT TCA ACT CTC TCC TGT GAG AAC AAA ATT 6439
Ser Val Lys Cys Glu Lys Ile Ser Thr Leu Ser Cys Glu Asn Lys Ile
65 70 75 80
ATT TCC TTT AAG GTAAG ACTGAGCCTT ACTTTGTTTT CAATCATGTT AATATAATCA 6496
Ile Ser Phe Lys
ATATAATTAG AAATATAACA TTATTTCTAA TGTTAATATA AGTAATGTAA TTAGAAAACT 6556
CAAATATCCT CAGACCAACC TTTTGTCTAG AACAGAAATA ACAAGAAGCA GAGAACCATT 6616
AAAGTGAATA CTTACTAAAA ATTATCAAAC TCTTTACCTA TTGTGATAAT GATGGTTTTT 6676
CTGAGCCTGT CACAGGGGAA GAGGAGATAC AACACTTGTT TTATGACCTG CATCTCCTGA 6736
ACAATCAGTC TTTATACAAA TAATAATGTA GAATACATAT GTGAGTTATA CATTTAAGAA 6796
TAACATGTGA CTTTCCAGAA TGAGTTCTGC TATGAAGAAT GAAGCTAATT ATCCTTCTAT 6856
ATTTCTACAC CTTTGTAAAT TATGATAATA TTTTAATCCC TAGTTGTTTT GTTGCTGATC 6916
CTTAGCCTAA GTCTTAGACA CAAGCTTCAG CTTCCAGTTG ATGTATGTTA TTTTTAATGT 6976
TAATCTAATT GAATAAAAGT TATGAGATCA GCTGTAAAAG TAATGCTATA ATTATCTTCA 7036
AGCCAGGTAT AAAGTATTTC TGGCCTCTAC TTTTTCTCTA TTATTCTCCA TTATTATTCT 7096
CTATTATTTT TCTCTATTTC CTCCATTATT GTTAGATAAA CCACAATTAA CTATAGCTAC 7156
AGACTGAGCC AGTAAGAGTA GCCAGGGATG CTTACAAATT GGCAATGCTT CAGAGGAGAA 7216
TTCCATGTCA TGAAGACTCT TTTTGAGTGG AGATTTGCCA ATAAATATCC GCTTTCATGC 7276
CCACCCAGTC CCCACTGAAA GACAGTTAGG ATATGACCTT AGTGAAGGTA CCAAGGGGCA 7336
ACTTGGTAGG GAGAAAAAAG CCACTCTAAA ATATAATCCA AGTAAGAACA GTGCATATGC 7396
AACAGATACA GCCCCCAGAC AAATCCCTCA GCTATCTCCC TCCAACCAGA GTGCCACCCC 7456
TTCAGGTGAC AATTTGGAGT CCCCATTCTA GACCTGACAG GCAGCTTAGT TATCAAAATA 7516
GCATAAGAGG CCTGGGATGG AAGGGTAGGG TGGAAAGGGT TAAGCATGCT GTTACTGAAC 7576
AACATAATTA GAAGGGAAGG AGATGGCCAA GCTCAAGCTA TGTGGGATAG AGGAAAACTC 7636
AGCTGCAGAG GCAGATTCAG AAACTGGGAT AAGTCCGAAC CTACAGGTGG ATTCTTGTTG 7696
AGGGAGACTG GTGAAAATGT TAAGAAGATG GAAATAATGC TTGGCACTTA GTAGGAACTG 7756
GGCAAATCCA TATTTGGGGG AGCCTGAAGT TTATTCAATT TTGATGGCCC TTTTAAATAA 7816
AAAGAATGTG GCTGGGCGTG GTGGCTCACA CCTGTAATCC CAGCACTTTG GGAGGCCGAG 7876
GGGGGCGGAT CACCTGAAGT CAGGAGTTCA AGACCAGCCT GACCAACATG GAGAAACCCC 7936
ATCTCTACTA AAAATACAAA ATTAGCTGGG CGTGGTGGCA TATGCCTGTA ATCCCAGCTA 7996
CTCGGGAGGC TGAGGCAGGA GAATCTTTTG AACCCGGGAG GCAGAGGTTG CGATGAGCCT 8056
AGATCGTGCC ATTGCACTCC AGCCTGGGCA ACAAGAGCAA AACTCGGTCT CAAAAAAAAA 8116
AAAAAAAAAG TGAAATTAAC CAAAGGCATT AGCTTAATAA TTTAATACTG TTTTTAAGTA 8176
GGGCGGGGGG TGGCTGGAAG AGATCTGTGT AAATGAGGGA ATCTGACATT TAAGCTTCAT 8236
CAGCATCATA GCAAATCTGC TTCTGGAAGG AACTCAATAA ATATTAGTTG GAGGGGGGGA 8296
GAGAGTGAGG GGTGGACTAG GACCAGTTTT AGCCCTTGTC TTTAATCCCT TTTCCTGCCA 8356
CTAATAAGGA TCTTAGCAGT GGTTATAAAA GTGGCCTAGG TTCTAGATAA TAAGATACAA 8416
CAGGCCAGGC ACAGTGGCTC ATGCCTATAA TCCCAGCACT TTGGGAGGGC AAGGCGAGTG 8476
TCTCACTTGA GATCAGGAGT TCAAGACCAG CCTGGCCAGC ATGGCGATAC TCTGTCTCTA 8536
CTAAAAAAAA TACAAAAATT AGCCAGGCAT GGTGGCATGC ACCTGTAATC CCAGCTACTC 8596
GTGAGCCTGA GGCAGAAGAA TCGCTTGAAA CCAGGAGGTG TAGGCTGCAG TGAGCTGAGA 8656
TCGCACCACT GCACTCCAGC CTGGGCGACA GAATGAGACT TTGTCTCAAA AAAAGAAAAA 8716
GATACAACAG GCTACCCTTA TGTGCTCACC TTTCACTGTT GATTACTAGC TATAAAGTCC 8776
TATAAAGTTC TTTGGTCAAG AACCTTGACA ACACTAAGAG GGATTTGCTT TGAGAGGTTA 8836
CTGTCAGAGT CTGTTTCATA TATATACATA TACATGTATA TATGTATCTA TATCCAGGCT 8896
TGGCCAGGGT TCCCTCAGAC TTTCCAGTGC ACTTGGGAGA TGTTAGGTCA ATATCAACTT 8956
TCCCTGGATT CAGATTCAAC CCCTTCTGAT GTAAAAAAAA AAAAAAAAAA GAAAGAAATC 9016
CCTTTCCCCT TGGAGCACTC AAGTTTCACC AGGTGGGGCT TTCCAAGTTG GGGGTTCTCC 9076
AAGGTCATTG GGATTGCTTT CACATCCATT TGCTATGTAC CTTCCCTATG ATGGCTGGGA 9136
GTGGTCAACA TCAAAACTAG GAAAGCTACT GCCCAAGGAT GTCCTTACCT CTATTCTGAA 9196
ATGTGCAATA AGTGTGATTA AAGAGATTGC CTGTTCTACC TATCCACACT CTCGCTTTCA 9256
ACTGTAACTT TCTTTTTTTC TTTTTTTCTT TTTTTCTTTT TTTTTGAAAC GGAGTCTCGC 9316
TCTGTCGCCC AGGCTAGAGT GCAGTGGCAC GATCTCAGCT CACTGCAAGC TCTGCCTCCC 9376
GGGTTCACGC CATTCTCCTG CCTCACCCTC CCAAGCAGCT GGGACTACAG GCGCCTGCCA 9436
CCATGCCCAG CTAATTTTTT GTATTTTTAG TAGAGACGGG GTTTCACCGT GTTAGCCAGG 9496
ATGGTCTCGA TCTCCTGAAC TTGTGATCCG CCCGCCTCAG CCTCCCAAAG TGCTGGGATT 9556
ACAGGCGTGA GCCATCGCAC CCGGCTCAAC TGTAACTTTC TATACTGGTT CATCTTCCCC 9616
TGTAATGTTA CTAGAGCTTT TGAAGTTTTG GCTATGGATT ATTTCTCATT TATACATTAG 9676
ATTTCAGATT AGTTCCAAAT TGATGCCCAC AGCTTAGGGT CTCTTCCTAA ATTGTATATT 9736
GTAGACAGCT GCAGAAGTGG GTGCCAATAG GGGAACTAGT TTATACTTTC ATCAACTTAG 9796
GACCCACACT TGTTGATAAA GAACAAAGGT CAAGAGTTAT GACTACTGAT TCCACAACTG 9856
ATTGAGAAGT TGGAGATAAC CCCGTGACCT CTGCCATCCA GAGTCTTTCA GGCATCTTTG 9916
AAGGATGAAG AAATGCTATT TTAATTTTGG AGGTTTCTCT ATCAGTGCTT AGGATCATGG 9976
GAATCTGTGC TGCCATGAGG CCAAAATTAA GTCCAAAACA TCTACTGGTT CCAGGATTAA 10036
CATGGAAGAA CCTTAGGTGG TGCCCACATG TTCTGATCCA TCCTGCAAAA TAGACATGCT 10096
GCACTAACAG GAAAAGTGCA GGCAGCACTA CCAGTTGGAT AACCTGCAAG ATTATAGTTT 10156
CAAGTAATCT AACCATTTCT CACAAGGCCC TATTCTGTGA CTGAAACATA CAAGAATCTG 10216
CATTTGGCCT TCTAAGGCAG GGCCCAGCCA AGGAGACCAT ATTCAGGACA GAAATTCAAG 10276
ACTACTATGG AACTGGAGTG CTTGGCAGGG AAGACAGAGT CAAGGACTGC CAACTGAGCC 10336
AATACAGCAG GCTTACACAG GAACCCAGGG CCTAGCCCTA CAACAATTAT TGGGTCTATT 10396
CACTGTAAGT TTTAATTTCA GGCTCCACTG AAAGAGTAAG CTAAGATTCC TGGCACTTTC 10456
TGTCTCTCTC ACAGTTGGCT CAGAAATGAG AACTGGTCAG GCCAGGCATG GTGGCTTACA 10516
CCTGGAATCC CAGCACTTTG GGAGGCCGAA GTGGGAGGGT CACTTGAGGC CAGGAGTTCA 10576
GGACCAGCTT AGGCAACAAA GTGAGATACC CCCTGACCCC TTCTCTACAA AAATAAATTT 10636
TAAAAATTAG CCAAATGTGG TGGTGTATAC TTACAGTCCC AGCTACTCAG GAGGCTGAGG 10696
CAGGGGGATT GCTTGAGCCC AGGAATTCAA GGCTGCAGTG AGCTATGATT TCACCACTGC 10756
ACTTCTGGCT GGGCAACAGA GCGAGACCCT GTCTCAAAGC AAAAAGAAAA AGAAACTAGA 10816
ACTAGCCTAA GTTTGTGGGA GGAGGTCATC ATCGTCTTTA GCCGTGAATG GTTATTATAG 10876
AGGACAGAAA TTGACATTAG CCCAAAAAGC TTGTGGTCTT TGCTGGAACT CTACTTAATC 10936
TTGAGCAAAT GTGGACACCA CTCAATGGGA GAGGAGAGAA GTAAGCTGTT TGATGTATAG 10996
GGGAAAACTA GAGGCCTGGA ACTGAATATG CATCCCATGA CAGGGAGAAT AGGAGATTCG 11056
GAGTTAAGAA GGAGAGGAGG TCAGTACTGC TGTTCAGAGA TTTTTTTTAT GTAACTCTTG 11116
AGAAGCAAAA CTACTTTTGT TCTGTTTGGT AATATACTTC AAAACAAACT TCATATATTC 11176
AAATTGTTCA TGTCCTGAAA TAATTAGGTA ATGTTTTTTT CTCTATAG GAA ATG AAT 11233
Glu Met Asn
85
CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG 11281
Pro Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln
90 95 100
AGA AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA 11329
Arg Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser
105 110 115
TAC GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA 11377
Tyr Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys
120 125 130 135
CTC ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC 11425
Leu Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe
140 145 150
ACT GTT CAA AAC GAA GAC TAGCTATTAA AATTTCATGC C 11464
Thr Val Gln Asn Glu Asp
155
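
The residue rows printed beneath the codon rows of SEQ ID NO: 13 can be checked against the standard genetic code. The sketch below is illustrative only: the names `CODON_TABLE`, `THREE_LETTER` and `translate_row` are hypothetical, and the table is the standard nuclear code, which is assumed here because the listing does not name a translation table. The two example rows are the codon row ending at position 11281 and the coding part of the final row.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code as one-letter residues, codons enumerated in TCAG order
# with the third base varying fastest ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

def translate_row(codon_row):
    """Translate a space-separated row of complete codons to 3-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codon_row.split()]

ROW_11281 = "CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG"
LAST_CODING_ROW = "ACT GTT CAA AAC GAA GAC"

print(" ".join(translate_row(ROW_11281)))
print(" ".join(translate_row(LAST_CODING_ROW)))
```

Rows that begin or end with a codon split by an intron need the flanking bases joined before translation; the helper above expects complete codons only.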



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: 5' UTR

(B) LOCATION: 1..15606

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 15607..15685

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 15686..17056

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 17057..17068

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 17069..20451

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 20452..20468

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 20469..20586

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 20587..21920

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 21921..22054

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 22055..26827

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 26828..27046

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3' UTR

(B) LOCATION: 27047..28994

(C) IDENTIFICATION METHODS: E
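
The leader-peptide, intron and mature-peptide locations of SEQ ID NO: 14 correspond to those of SEQ ID NO: 13 shifted by a constant amount. The sketch below checks this; it is an illustration only, the names `SEQ13_INNER`, `SEQ14_INNER` and `shift` are hypothetical, and the UTR entries are left out because their extents differ between the two sequences.

```python
# (key, start, end) for the leader peptide, intron and mat peptide features,
# copied from the feature tables of SEQ ID NO: 13 and SEQ ID NO: 14.
SEQ13_INNER = [
    ("leader peptide", 4, 82),
    ("intron", 83, 1453),
    ("leader peptide", 1454, 1465),
    ("intron", 1466, 4848),
    ("leader peptide", 4849, 4865),
    ("mat peptide", 4866, 4983),
    ("intron", 4984, 6317),
    ("mat peptide", 6318, 6451),
    ("intron", 6452, 11224),
    ("mat peptide", 11225, 11443),
]

SEQ14_INNER = [
    ("leader peptide", 15607, 15685),
    ("intron", 15686, 17056),
    ("leader peptide", 17057, 17068),
    ("intron", 17069, 20451),
    ("leader peptide", 20452, 20468),
    ("mat peptide", 20469, 20586),
    ("intron", 20587, 21920),
    ("mat peptide", 21921, 22054),
    ("intron", 22055, 26827),
    ("mat peptide", 26828, 27046),
]

def shift(features, offset):
    """Return the features with both coordinates moved by a constant offset."""
    return [(key, start + offset, end + offset) for key, start, end in features]

# The offset is taken from the first pair of start positions.
OFFSET = SEQ14_INNER[0][1] - SEQ13_INNER[0][1]

print("offset:", OFFSET)
print("all features agree:", shift(SEQ13_INNER, OFFSET) == SEQ14_INNER)
```

Under this reading, position n of SEQ ID NO: 13 corresponds to position n + 15603 of SEQ ID NO: 14 across the whole coding region and its introns.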



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ACTTGCCTTA AAAGCTTTGC ATAGGTAGAC AACATTAGAT TAATTTCCTT GCTCACATCT 60

GTTCAAGAAA AATCATTTAA GTTATAAAAT ATAACAAACC TTCTGCATTA TAAGACTGAT 120

GTTTAGAAAT ATAAACATTT TATACATCAC CATTTAAATC TTTCTCCAAG GCTTCATCTT 180

TATAAAATAG TCCGGAAATT TCAGAGAAAG ATGAATCTGA TTTTCCAAGA GAGGACAGCT 240

GTGGACTATC TGGCACTGGA GACTAAATAA AGAAAGCAGG TACAGTCAAT AAGATCTTCA 300

GGACATATAC ATTTTGTTTA TTAAGAAAAA GCAAATAAAA CATTTTTCAG AAAAAGGCAA 360

ACATGCTAGA AAGCATATGA CTTAGTCATT TGAGTTTTTA TTATTAAGGA AATTTACAGG 420

CCCAAGAAAC ACCTTGCTCA ATATATTAAA TTTTATTTTG GTTTTCAACT AGACTTTGCT 480

TTTCATTTGT TTGTTTTTGT GACAAGTTCT CGCTCTGTCA CCTAGGCCAA AGTGTAGTGA 540

CACAATCTTA GCTCACTGTA GCCTCCTAGA TTCAAGTGAT CCTCCTGTCT CAGACTCCTG 600

AGTAGCTAGG ACTACAGGAA CATTCCACCA TGCCCAGCTA ATTTTGTTTT GTTTTGTTTT 660

GTTTTCAGAG ACAATGTATT GCAGCGTTGC CCAGGCTGAT CTGAAACTCT TAGCCTCAAA 720 

CGATACTCCT GCCTCAGCCT CCCAAAGCAC TAGGATTACA GACATGAGCC AATGCGCCCA 780 

GCCTTAAATT AGACTTTAAA TGTGGTTTTA AACTCCTGTT GAAAAAGCGT CTGGTATCTT 840 

GAACCAGTAG ATGTTTTCAT AGCAATGAAG CTAAACTGTA ATTTAGACAG TAGCCAAATG 900 

CTTGTGAAAT TTTGCTAAAT AATATAATCT TCAAGGGAGC AAATCATGTC CCAAATGCAA 960 

AAGATCAACT GGTGGGGGCA GTAGTAAAAG ACAGGATACT GTGCTCTTTA AAAGGTCAGT 1020 

AACTATAGTA CCTAGTTATC TTACTTATCA CAGCAAAATA ATTACATAAA ATCCTATGGA 1080

TCATAAAGGC ACAGACTCAC TTCTGTCTCT AGATCTCAAG CTACCAAAAA GAAATCTCCC 1140 

AATAGTTTCT TGGAGGCCTA TACTTAGTGA AAAAGCAGCT GGAATCAACA TAGTTCCTCC 1200 

TATGTTGTAG GACAATCCTA GCTCTGGGCA TACGAATACA TTAAATCCCA CTTATCTATA 1260

GAGCTTTCTT AAAGGGAAGA AATTTGAGTA GTATGTAAAA CAGAATAAAA GATTAAGGCT 1320

CCATAGGCAT ACAGCTTACC TCCAATTCTC TTGGCCTCTT GCAATTTCTA TTATCAGGCT 1380

TTACAAGGTG ATTTGCCATC ATATTCCGAA GGCACCAGCT ACAAAGCTTA GAACAATGCC 1440

AGATTTAGGT ACAAACTCCA TGCTACAAGC TCTCTGGAAT CCTTCCCTGT TTCCCACTCC 1500

TACTGCTGAT GTTAATTTAG ACTGTCATTA TCTGTCACTT TCCTAAACTC AATTTCTCCC 1560 

TCCTCTAAAT CATTCTATCA ACTGCTATTT GGGTAATCTT TCAAAACTTT GATTACTGCA 1620 

TTCCTTTAAC TCAAAAACTT TCATTGTTCC AGAATAAGTT GAAATTCCAT GATATGGCCT 1680

TCAAGGTCCT GTATTATCTG GTGCAAGCCT ACTAGTCCCA TCATTTTCAA CTACTCCTCT 1740

CTATGTACTT AGCCAAATGA GTCTCTCTGG CAATTGTGCC TTGTTTCAGG ACTGGCTCAG 1800

TTAAGATTCT TTTATCTTCG GCCGGGCGCG CTGGCTCACG GCTGTAATCC CAGCACTTTG 1860

GGAAGCTGAG GCAGGAAGAT CACCTGAGGT CGGGAGTTCG AGACCAGCCT GGCCAGCATG 1920

GTGAAACCCT GTGTCTACTA AAAATCCAAA CATTAGCCAG GCGTGGTGGC AGGCGCCTGT 1980

AATCCCAGCT ACTTGGGAAG CTGAGGTGAG AGAATCGCTT GAACCCAGGA GAGGGAGGTT 2040

GCAGTGAGCC GAGATTGTGC CATTGCACTC CAGCCTGGGC AACAGAGCGA GACTCCACCT 2100 

CAAAAAAAAA AAGGATTCTT CTATCTTCAC AAAATCTTAA TGTTTAAACA GGTCTTACAG 2160 

TTCATCTAAT TCAATCTCAT TTTTTACAAG TGAGAAAACA GGGACAGTGA CGGTGGATCA 2220

AGTGACACCA GTAAGACTGA GCTAAATTAG AACCGAGATC TCACTCGAGT CTGAGGTTAT 2280

TCCCACTGTC CAACCTTACT TTAAAGTAGC TTCAAATTTT ACTTTTAGTT TTCCATAAAT 2340

TCGGAAGGGA TTTTCCCTAG GAGTCCAAAT GTTGAAACCT GGAAGGGTAT AGTCTCTGTG 2400

TCTTTGAGAT GAGGGGAGCC CTGTCCATAT TCAAGTTATC AATTGACTTT GTTGTTTTTG 2460

AGAAACGATG CTGATTTGGG TAACTTTAAC ACATCTGTTT GATTAGTCCT ATAAAATATG 2520

CATATATAGA AGACAGAAAG AGCAACAACA AATTTGAAAG ATGCTTGTTA AGTAAATTCT 2580

GTATCGTACG TGTCCATTCC TGCCAGTACC TTTATAGTAT GTAAGTTTAC GTGCTGTAAT 2640

AGTATTAATA GTATCTAGAA AATACTACAC ATGCACAGCA GTGCTAACTT TGCCTTGGGA 2700

GTTGGAAAAT ACTTCAGAGA AGCCAACAGG CAGATTTTTC TCTCTTCCCT TCCCCTTCTA 2760

ATTTTCCCTT TCCCCTTCAC CCCCTTCTCT TCTCTCCCCA AGTAACACTG TGCACCTATG 2820

TCAAACGAAA ACTTATAATC AAGTAACTGT TTCTGCAAAA ATAAGTTCGT TTTGCTGTCA 2880

TGGCTCAAGG CCTCAGCAGA TCCAGGCCTG GTGGACGGGC TGGTCTTCGT CGTGTGCCAA 2940

ACACTGACCA CTGCCCTGGC TCTGCCATCT TAGGCTTAGT GACCTGGCTG TTACTAAGCA 3000

CTGTCCCCTC TGCCCCATGC AGCTGTCTCC TTCTAGTCTT CTCCCTCTTC TCAACGCGAT 3060

CCTAGCCCCT CAGGCCATTT CACCTCCATT TTCCCTCACT TCCCGCCGCC CCTCCGCACT 3120 

TCCTCCCTAC TGTTGTTTCC GCCCCACTAG AGCCCCTCAG AGAAAGTTTC CATCCTCGCA 3180

CCCTTCCTTG TGTCACAGCC CGTCACATTC TCACAGGCGC CCATCCCTCC AGCCCCACCC 3240

CAAGGCCAAT GTACTTCGCG GTATGGGGAC CTTCCTCGTC AGCGAACGCG AGGGAGTGAA 3300

GACCCTGGGC GCGGGGTGCT CGGACTTCGG GGGTGGAGGT GGGAAGCGCG CCGCACTCCC 3360

AGCAGCCCCT GCACGAGTCA CGTGACAGCT CTCCCACCAC CACCCCCCCC AACTTCCCCA 3420

CCGTAGCCTC CCAGAGCCAG GCCCCACGGA AAGGCAGCTT TTTCCCGGTT TTCTCCCGCT 3480

CTTTCCCCTC CACTTGGAAT ACTCGTGAAA CAAAAATCTC TCCCTGCCAC CCTGTGTGTG 3540

TTTGAACCAG GAAAAAATCT GAAACTGGTC AAGAAAGAAC AAGGAAGACT TGCCAAAGCA 3600

AGGCCGGTGT GTGTCCCAGC AGCTTAGAAT CTCAGCAAAG GAACACAAAA TAGCACATCC 3660

ACGGCCTCTT TTCGAGTAAA ATTTACTTGG TTTGTTTGCA GGAAGGGTTT AAAACTGCGT 3720 

TTGCAGATGC TCTGTTTGCA GGAAGGCTTT AATCACGTGT TCCCCTGGCC CACAAGCAAG 3780

GCTTTTAGAT CCAGAGCCTC AGTTACTGCC CCCTCTTCCT CTTTGGTGCA ACCAAACGTT 3840

CAGAATCACG CCTTCTTAGA AAATTCTTAC CCCGGGTGTG TCAATAAGTT AAGTCTAATT 3900

GGCAACAGCT ATCAAAAAGT GTTGCATAAC ACACATGGCT CACATAATTG TAGCTTTGCC 3960

TCATCGGGTG TTTTAATGCG GAGGCTTTGA CCTGCAATTT CAAAGATATA CATTCCAAGC 4020

TTACGCCCAG TTAGTGGATG TGGAAGAAAA AAAAAAGCAA ATTACCTCAT AACACAAAGG 4080

TCAATAACAC ACATCCATAA GCTCCAGGTA CAAAATCTTA CATCTTAGAG AACTATATTT 4140

AACATTTACA TACATTAGTA AGGTTTTTTT TTTCCTTTTG CTTGATTAAA TGTTAGTTAT 4200

CATTAAGTCT TGGAATTATT CTGTGTGTGT ATATTTATTT GCTGTTTGTG AAGAAGCCGG 4260

TTGTTTTAAA TAAGTTCCTA GAAAATAAGC GCTCAATGTG TTTAATCTGA GTTGCTAATA 4320

TTGTGAAATA TAGGCCACAT AATACTAGCC TAGATAACTA TGGCGAAGTA AGGAGTCTCA 4380

AACACTGTCC CAGAACAATA GCAATCTGTG TTGAATTTTT ACCCTCTGTG GTAAAATGAA 4440

GGGAAAAGGA ATGAAGTTTT AGTTTGCCTT AATTTTTATC TTTATTGTTT CAGACTCTTC 4500

AGCAGTATAA AGTTTTCATC AAGTCAAATA TATTCACTTT AAAGTGACTG TGCTTTATTC 4560

TGATACCATG TCCTTCCTAA TTTGGGGGGC CAGGTGAGAT AAGTTTTATG AAATAAAAAG 4620
ATTAAAAATT CTTACATTTT TAGTGTCCTT CCTTGGTAAA ATGTAGAGTT GTCCACTGTG 4680
TTTATCTCCT CCTCCTTATT ATCATGGTTG CTGTTATTAT TTTTAATGGT TCATTAAACC 4740
CAAGGGTCTG GGAAATACTC ATGGAATTCA TCTCACAGCC TTCACACTGT ATGATATTTA 4800
AACAGGTGGT TGTCCATCTG ATTCTTAAAA TATTTCCAAG AAAAATGATT CCACCTAATG 4860
CATAAATGCT TTCATCAGAT TAAGAGAACA CCATGGACAT TTTATTTTAT TTTATTTTTT 4920
AAATATTAAC TTCCATTGCA TAAGCTAAAT GGGTAGGAAT AAGTGAGATG ATATTGTTAT 4980
CTAGAGCTTT AAAATATTCA AAGGGCTGTC ATCATTATCT CATTTAATCT TTGAAAACAA 5040
CTCTATGAAG TACAAAGGAC ACTGAGACAT TTGTTGCTCT ATATCAAAGA AAAAAGTGTT 5100 
TGTCCCAAAA CTTCAAAATG TGTAAATTAC ACATTCTGCA TCTTTACAGC TGGAGAAAAT 5160 
TCACTGGCAA TGGAATATTT AAAATTAGAG CTTGCTTAGT GTGCTGCTTC TGATCACTAC 5220
TTGATCCCAC TTCGTGCTTT CATGTTAATT GGCCCAATTG GACTCTACAG TTGGAAGGTG 5280 
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AAAACTTACT ATTTCAACTT GAGTCACGTA TGTATTCTTA TCATATACTT CTTAAAGGTA 534 0 

CTATTTTTTT TCTTCTGATA GTCACCACAC CAAGCACTTC CAGCCACCCT GCCACAGACT 5400

TCCTTTGTAA TCACTGTTGA AGGACATGAT GTTTTTATGA CTTCCCGAAA TGAAAACCCT 5460

ATCTTGTTTT TAAAACAAAC AAACCAACAA AAAGTAGTGT TTATGTAAGC ATTTTGTTCC 5520

CTGACTCTAG GAACCCCTCT GTTTTTATAT CAACTCTGTA CTGGGAAAAC ACAAAAACAA 5580

AATGCCACCT TGCTAATTCC CTTCCTAGCA AAGTAATACA GTTTAGCACA TGTTCAAGAA 5640

AAAAATGGCT AAGAAATTTT GTTTGCACTA ATTATTTTCA AGACTGTGAT ATTTACACTC 5700 

TGCTCTTCAA ACGTTACATT TTATAAGACT ATTTTTTAAC ATGTTGAACA TAAGCCCTAA 5760 

ATATATGTAT CCTTAAATTG TATTTCAAAT ATTTTAGGTC AGTCTTTGCT ATCATTCCAG 5820 

GAATAGAAAG TTTTAACACT GGAAACTGCA AGTAAATATT TGCCCTCTTA CCTGAATTTT 5880 

GGTAGCCCTC TCCCCAAGCT TACTTTCTGT TGCAGAAAGT GTAAAAATTA TTACATAAAA 5 94 0 

TTCTAATGAT GGTATCCGTG TGGCTTGCAT CTGATACAGC AGATAAAGAA GTTTTATGAA 6 000 

AATGGACTCC TGTTCCACTG AAAAGTAAAT CTTAATGGCC TGTATCAACT ATCCTTTGAC 6 06 0 

ACCATATTGA GCTTGGGAGG AAGGGGAAGT CCTGAATGAG GTTATAAAGT AAAAGAAAAT 6120 

ATTTGCAAAA TGTTCCTTTT TTTAAAATGT TACATTTTAG AAATATTTTA AGTGTTGTAA 618 0 

CATTGTAGGA ATTACCCCAA TAGGACTGAT TATTCCGCAT TGTAAAATAA GAAAAAGTTT 624 0 

TGTGCTGAAG TGTGACCAGG AAGTCTGAAA ATGAAGAGAG ACAGATGACA AAAGAAGATG 6300 

CTTCTAATGG ACTAAGGAGG TGCTTTCTTA AAGTCAGAAA GAGATACTCA GAAAGAGGTA 636 0 

CAGGTTTTGG AAGGCACAGA GCCCCAACTT TTACGGAAGA AAAGATTTCA TGAAAATAGT 642 0 

GATATTACAT TAAAAGAAGT ACTCGTATCC TCTGCCACTT TATTTCGACT TCCATTGCCC 64 8 0 

TAGGAAAGAG CCTGTTTGAA GGCGGGCCCA AGGAGTGCCG ACAGCAGTCT CCTCCCTCCA 6 54 0 

CCTTCTTCCT CATTCTCTCC CCAGCTTGCT GAGCCCTTTG CTCCCCTGGC GACTGCCTGG 6600 

ACAGTCAGCA AGGAATTGTC TCCCAGTGCA TTTTGCCCTC CTGGCTGCCA ACTCTGGCTG 666 0 

CTAAAGCGGC TGCCACCTGC TGCAGTCTAC ACAGCTTCGG GAAGAGGAAA GGAACCTCAG 6 720 

ACCTTCCAGA TCGCTTCCTC TCGCAACAAA CTATTTGTCG CAGGTAAGAA ATATCATTCC 67 8 0 

TCTTTATTTG GAAAGTCAGC CATGGCAATT AGAGGTAAAT AAGCTAGAAA GCAATTGAGA 6840 

GGAATATAAA CCATCTAGCA TCACTACGAT GAGCAGTCAG TATCAACATA AGAAATATAA 6900 

GCAAAGTCAG AGTAGAATTT TTTTCTTTTA TCAGATATGG GAGAGTATCA CTTTAGAGGA 696 0 

GAGGTTCTCA AACTTTTTGC TCTCATGTTC CCTTTACACT AAGCACATCA CATGTTAGCA 702 0 

TAAGTAACAT TTTTAATTAA AAATAACTAT GTACTTTTTT AACAACAAAA AAAAGCATAA 70 8 0 

AGAGTGACAC TTTTTTATTT TTACAAGTGT TTTAACTGGT TTAATAGAAG CCATATAGAT 7140 

CTGCTGGATT CTCATCTGCT TTGCATTCAG ACTACTGCAA TATTGCACAG AATGCAGCCT 7200 

CTGGTAAACT CTGTTGTACA CTCATGAGAG AATGGGTGAA AAAGACAAAT TACGTCTTAG 72 60 

AATTATTAGA AATAGCTTTC ACTTTAGGAA CTCCCTGAGA ATTGCTGCTT TAGAGTGGTA 73 2 0 

AGATAAATAA GCTTCTCTTT AAACGGAATC TCAAGACAGA ATCAGTTACA TTAAAAGCAA 7380 

ACAAAAAATT TGCCCATGGT TAGTCATCTT GTGAAATCTG CCACACCTTT GGACTGGGCT 744 0 

ACAATTGGAT AATATAGCAT TCCCCGAGAT AATTTTCTCT CACAATTAAG GAAAGGGCTG 7 500 

AATAAATATC TCTGTTTGAA GTTGAATAAC AAAAATTAGG ACCCCCTAAA TTTTAGGGCT 756 0 

CCTGAAATTC GTCTTTTTGC CTATATTCAG CTACTTTACG TTCTATTAAA TCTTCTTTCA 762 0 

GGCCAGGTGC ACTAGCTCAT GCCTAGAATC TCAGGCAGGC CTGAGCCCAG GAATTTGAGA 768 0 

CCAGCCAGGG CAACACAGTC TCTACAAAAA AATAAAAAAT TACCTGGGTG TGTTGGTGCA 774 0 

TGCCTGTAGA ACTACTCAGG ATGCTGAGGA CTGCTTGAGC CCAGGATAGC CAAATCTGTG 7800 

GTGAGTTCAG CCACTAAACA GAGCGAGACT TTCTCAAAAA AACAAACAAA AAAACAAACA 7860 

AACTTCCTTC AAAATAACTT TTTATCTGCA ATGTTTTCCT ATTGCCTGTG AGATTAAATT 7 92 0 

TACTCTTTTA CCTGATTTCC AAAGCCCTCC ATAATCTAAT CCGACTTTAC CTTGTGTTCA 7 98 0 

CTGCAAAATA GCAGGACTGT TCCACTACAA TCCAAAAATC ACAGGTTGGG TGCAGTGGCT 8 04 0 

CACTCCTGTA ATCCCAACAC TTTGGAAGGC CAAGGCAGGT GGATTGCTTC AGCTCAGGAG 8100 

TTCAAGACCA GCCTGGGCAA CATGGCAAAA ACCCTGTCTC TCCAAAACAT ACAAAAATTA 816 0 

GCCAGATGTG GTAGTATGTG CCTGTAGTCC CAACTACTCA AAAGGCTAAG GCAAGAGGAT 8220 

CACTTGAGCC CAGGAGGTCA AGGCTACAGT GAGCCATGTT TACTGTGTCA CTGCACTCCA 8280 

GCCTGGGTGA TAGAGCAAGA CCATGTCTCA AAAAAAAAAA AAAGAAAAGA AAAGAAAAAA 834 0 

ACATCGCTCT ATTCAGTTCA CCCCCACCAC AACATTGTTT TGATTATCAC ATAAATGCTG 8400 

GTCCATTGCC TTCTCTATCT ATTCAAATCT TTAAGCATTC TTTGAGATTC AACTCAATTC 84 60 

TCCTTTTCAA ACTAGGCCAT TTAAACTACA TCAGTTCCAT TTTGATTTTC TTGCTTTGAG 8520 

TCTACAGACT CAAAAACAAA AACTTAAAAA CTTATTTTTT AAGTTTTCTG CTACTCTCAC 8580 

TTCTTCAACA CTCACATACA CGCATTCATA ATAAGATGGC AGAATGTTCA AGGATAAAAT 8640 

GATTTATAGA ACTGAAAAGT TAGGTTTTGA TCTTGTTGCT GTCAAGATGA CTACCTACCT 8700 

GATCTCAGGT AATTAATTAT GTAGCATGCT CCCTCATTTC ATCCCATACC TATTCAACAG 8760 

GATTGGAATT CCACAGCAAG GATAAACATA ATCATAGTTG CTTTTCAAGT TCAAGGCATT 8820 

TTAACTTTTA ATCTAGTAGT ATGTTTGTTG TTGTTGTTGT TGTTTGAGAT GGAGCCCTGC 8 8 80 

TGTGTCACCC AGGCTGGAGT GCAGTGGCAC GAACTCGGCT CACTGCAACC TCTGCCTCAT 8940 

GGGTTCAATC AGTTATTCTG CCTCAGTGTC CCAAGTAGCT GGGACTACAA GGCACATGCC 9000 

ACCATGCCTG GCTAATTTTT GTATTTTTAG TAGAAACAGG GCTTCACCAT GTTGGCCAGG 9060 

CTGGTCTCGA ACTCCTGACC TCAAGTGATC CAGCCGCCTC GGCCTCCCAA AGTGCTGGGA 9120 

TTACAGGCAT AAGCCACCGT GCCCAGCCTA ATAGTATGTT TTTAAACTCT TAGTGGCTTA 9180 

ACAATGCTGG TTGTATAATA AATATGCCAT AAATATTTAC TGTCTTAGAA TTATGAAGAA 924 0 

GTGGTTACTA GGCCGTTTGC CACATATCAA TGGTTCTCTC CTTACAGCTT TAATTAGAGT 93 00 
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CTAGAATTGC AGGTTGGTAG AGCTGGAACA GACCTTAAAG ATTGACTAGC CAACTTCCTT 93 6 0 
GTCCAAATGA GGGAACTGAG ACCCTTAAAA TTAAGTGACT TGCCCCAGAC AAAACTGGAA 94 20 
CTCATGTGTC CTAATTTCCA TCATGAAATT CTACCATTCA CTAGCCTCTG GCTAGTTGTC 94 80 
AAAGTATTGC ATAACTAAAT TTTTATGTCT GTTTTAAAGA ACAAATTGTC ACTGCTTACT 9 54 0 
CCTGGGAGGG TCTTTCTGAG GTGGTTTATA ACTCTTAAAA AAAAAAAAGT CAGTAGTCTG 9600 
AGAATTTTAG ACGAAATAGT CAAAGCATTT TTATCCAATG GATCTATAAT TTTCATAGAT 96 60 
TAGAGTTAAA TCAAAGAAAC ACGGATGAGA AAGGAAGAGG AAAATTGAGG AGAGGAGGAA 9720 
TGGGGATGAG AACACACTAC TTGTAATCAG TCATAGATGT ACTGAGAACT AACAAGAAGA 9780 
ATTGTAAGAA AATAAGAATG AAGAATTCAA AATCAACACA TGAAATAAAA AGAAACTACT 984 0 
AGGGAAAAAT GGAGAAGACA TTAGAAAAAT TATTCTATTT TTAAAATTCT GTTTTCAGGC 9 900 
TTCCCTCCTG TTCTTCCTCC TTCTCATTGG TTTTCAGGTG GAGGGAAAGT TTAAGATGGA 9960 
AAAAATATAT ATATTCTACA CATCCCTTTC TACGCTGTTG TCATGGCAAC AAGGTTTATC 10 020 
ATAGCAAACT TTTATTCATA CAACATTTAT TGAGTTCTTA CTGTGTGGTA AGCTCTTTCC 10 080 
AGGTGTTGAA AATTCAGGGG AAAAAAGACA ACTCATTGTC TTAAAACTCA GATGAAAGCT 10140 
GAACAGACCT ATTTTTAATC AAAGTAATCT CAATTTAGGG TAGTAAGAGC TATTTAAGAA 102 00 
GCATGAACAG GTGTGAAGGA GGTAGGACTC TGAGGAGAGA ATAGTTAGCT AGGAATGAAA 102 6 0 
GAGCAGAGAA GTTTTCCTAG AGGAACTATT AAAGCTGGGA GTTACGGGAT GAAAGATGAG 10 320 
GCAGGGTTTG GAGGCAAAAA AAAAAAAAAG GCAGGGGAAG GGGAAGTTCT GGCCTGGCAG 10380 
AGAGAATAAC TGTGGCAACA ATGGAGGAGA GTCTGGAAGC AAGAAAACCA AGTAGAAGAG 1044 0 
TATTAAAATA GAAGATGCCA GGGGTAATGA GGGCTTGATT TAAAACAGTG CTGTTGGAGA 10500 
TGGAGAGGAG ATACCAAATT CTGGAGACAT TTCTGAGTTA GAACCTACAG TATTTATCAG 10 56 0 
ACAAGGGAAA GATTAGACAA AGGAGTTAAG AATGACTCCC AGGTTTCAGT TTGGGGCAGG 10620 
TAACTAGGAC ATGTTTTGAA AAGTAATGTA TTGGATCTCT TACCATTGGA ACTATGTATG 10680 
TGGAGCCAAA TTAAAATTTG TACATGTATA TAACTCTCCC CCCACCACCA GTAACTACTT 10740 
CCCTAACTCT CTACTTTGTA GCCAGACTTC CTAAAAGAAT AGTTTGTAGT CACTGTCTTT 108 00 
ACTTTTCCCC TCCCATTCTG TCCTAGATAT TTGTCCACCT ACCATCTGCT GCCTCCACTT 10860 
TACCCAAACT GTTCTACGGT TGCCCAAAAC TTCCTAATTG CCAAATTCAA TGAACAAGTT 10 92 0 
TAAGCTTATA TGTAAATTAG GAGCTCTACA GTTTGATTTC GAGCAGCCCC TCCTGAAACC 10 98 0 
CTTTCTCTTT CGACTTCTGT GACACATCTC AGATTTACAA AACTGAACTA ATTATTTTAC 11040 
ACTTGAGCTG TATTTTCGTT CTTCTTTCTT GATGAATGAG GTAACCACTC AACAAATTGC 11100 
CCAAGCCAAA AACTACGAAG TCATCCTCAG TTCCTCCTTC TTCTGTTTGA CCCACAACAG 11160 
ATCAGCTGAG AAATCCCGCT GTTTAGTATC TCTTGAATTC ATTACCTTAA TTTATAGCCT 11220 
CATCAACTCT TAATTGTTAA AATTACTTCA GTAGTTGTTG TCTGACCTCT GTCCAATCTT 1128 0 
GTTCAATCAG GTCCATTCTT TTGTTCTTGG TGGTGGTGGT GGTGTTGACA GAGTTTCGCT 11340 
TTTGCTGCCC AGGCTGAAGT GCAGTGGAGC ACTTCACTGC AACCACAGCC TCCTGGGTTT 11400 
AAGCAGTTCA CCCTCCCGAG TAGCTGGGAC TACAGGTATG TGCCACCACA CCCAGCTAAT 11460 
TTTGTGTTTT CAGTAGAGAC A(?GGTTTCAC CATGTTGGTC AGGCTGGTCT CAAACTCCTG 11520 
ACCTCAAGCA ATCCACCCAC CTCAGCCTCC CAAAGTGCTG GGATTACAGG CATGAGCCAC 1158 0 
TGCACACGGA CCAGATCCAT TGTTTATGTT GCTTCTAGAG TGAGTTTTTA AAACACAAAT 11640 
TTGACCATAT CTTTCTCCAA TTTAAGTCAG TATTTTTTTT TTCAGGAAAA AACAGTTCAA 11700 
ACTCTTTAGT CTGCTTACAC AAGGCCTTTG TAGTCTGACT CTTCTTTCCA AGCTTTCATC 117 60 
AAAGTATACT GCAAGTTACA TTTTATGTGA ATTGAATTAG GCAACGGTAT AAAAATTATA 11820 
GTTTATATGG GCAAAATGGA AATAATGTTA ACTCTTCCAA ATAGTTTATC TAGAATGACA 1188 0 
TAATTTCAAA GCTGTCAGGT CAAATGAGTT ATAAACTGTT AACACTATTG CCACATGCAA 11940 
GTGTCTCTTA TACTTGGTAG AATTATCTGC TTCCATGTCA TTATTATGTA AATTAGACTT 12000 
TAAATAACTC AGAAGTTCTT CAGACATACA GGTTATTATT GTGCTTTTTA AACATAATTT 12 060 
TAAATAATTT TATATATGAT AATGTTATCC AAGTGCTAAG GGATGTATTG TTACTGCTGT 1212 0 
GCAAAAAAAA AAAAAAAAAA AACTCCAAAT AAATATGTTG AAACCAAGTT TATATGCAAG 12180 
AAAACAATAT TAAAAAGGCC AAAGTACCAC CATAATAGGC TGTGTGGAGA CGGCAGGCTA 12 24 0 
CAAAACACTA GTAATAATGC TGAGAAAGTT GAAAAAAGAA AGAAAGCAAC AATATGCTTT 12 300 
GGTTGTTGTA GGTTTATGTA CTCCAAGAAT ATCTCCTCTC AAACTTTTAC GTTTTTTCCA 12 36 0 
AAGAAAAGTT AACTTTGGCT GGGCGCAGTG GCTCTTGCCT GTAGTCCCAG CCTTTGGGAG 1242 0 
GCCAAGGGGG GCAGATCACC TGAGGTCAGG AGTTTGAGAC CAGCCTGACC AAAAATGGAG 12 480 
AAACCCGCCC CCCTCACTAC TAAAAGAATA CAAAATTAGG CCGGGCACAG TGGCTTACCC 12 54 0 
CTGTGATCCC AGCACTTTGG GAGGCCGAAG CAGGAAGATC ACCTGAGGTC AGGAGTTCGA 126 00 
GACCAGCCAT GGAGAAACCC GTCTCTACTA AAAATACAAA ATTAGCCGGG CGTGGTGGTG 12660 
CATGACTGTA ATCCCAGCTA CTCAGGAGGC TAAGGCAGAG AATCACTTGA ACCCAGGCAG 12720 
TGGAGGTTGC AGTGAGCCGA GATCGTGCCA TTGCACTCCA GCCTGGGCAA CAAGAGCGAA 12 780 
ACTCTGTATC CAAAAAACAA AAGAAAAGAA AAGGTAACCT TGAACTATGT GAGATCTTTA 12 84 0 
GAAATGCATT CTTTCTGTAA AATGTGACTA CATTTGCCTT ATTTATGGTA AAAATGTTGA 12 90 0 
GGCCTCAAAC AACCCATATT TTCTCGGTCT CCCCGCTGCC TAGCCTTTGT TCACATTGCT 12 960 
TCTTCTTGGT GGAAGCTCTT CCTCTGGCCT TGAAAATGCC TGCTTCTCTT TCAAGGTAGC 13 02 0 
ACAGTCATCA CTTTCTGTGG TAACCTTCTC CAGCACCATC AAACAGAAAG AATGAATCTC 13080 
TTGTAAATTC AGCTCTTACG TCATTCATTA CATTATTTTG TAACTCTTTA TAGATTCTTC 1314 0 
TCTCCCACTA GACTCTGAGT CACTGGAGAG TAGGAGCCAA CTCTCATTCA TGTGTGGTTT 13200 
GGTCAGCTAC TGGCCACATT CCTGATGCAT AGTTAATGCT CAAACCTTAA CTGGTGAATC 13260 
AGCTCAAATA TTGTCCTTCT CTAAATCCAT TCACTCATTG ACTAACTATG TACTCAAAAT 13320 
AGTAAACACC AGTAATTTAA TCCAATTCCT GCCCATACTG CTTGGTACAT TTCAGGTGAA 13380 
TTAGTTTGAT AAATATGTGT GTATTACATA ATATTAAAGT ATGTACAGAA GATCATGCTA 13440 
ATCATAATTC ACAACTGATA ACTAATCAAA CATAAATGCT CTCAGGTTAA CAAATGTCTG 13500 
CCTTCTCAGT TAATGCAGTC ATTAACAAAC ACCTTCTGAT GCTGATAATA GGGCCTTGTT 13560 
CAGCAATGAA GCCATAAAGG TGAATAAAGA ACATGCCCTC GTGGAGCTCA CAGCCTAGTC 13620 
ATTATTGTTC TGATTTTTAA TATTAATGTT GGTTTGGGTT TTGGTGAAAA ATGTTTAGAC 13680 
TTATCTTAGT GATCTTTTCA TCCTTTGCTA TATTATTTTT CTCTAAGAGT CTTCCTTATC 13740 
CCCTCCTTTA AAAAACTAGG TGATAATTCT AAATTGTAAA TTTAAATATT ATAAATAGCT 13800 
TATAAAATTT AATATTTATA ATATTTAAAT GTTTGATAAA TATTTAAATT TTATAATATT 13860 
TAAATGTTTA TTTAAATTCA TTTGTACATC AGTTTTTATT TTATTTAAAT GTGTTGGCCA 13920 
GGCATGGTGG CTGACACCTA TAATCCCAGA ACTTTGAGAG GCCAAGTCAG GCAAACCATT 13980 
TGAGCTCAGG AGTTTGAGAC CACCCTGGGC AACGTGGTGA AACCCTGTCT CTACCAAACA 14040 
TATGAAAACT TATCTGGGTG TGGTGGCACG CATCTGTGGT CCCAGATGGG AGTCCCAGGC 14100 
TAAGATGGGA GAATCGCTTG AACCCAGGTG AGAGGGGTGG GGTGGATGTT GCAGTGAGCT 14160 
GAGATCGTGC CACTGCACTC CAACCTGGGT GACAGAGTGA GACTCCATCT CAAAAAAAAA 14220 
AAATGTTATC TAAATAAGAT AAATTTAATA ACTGTTCGCA CTTAGATGAG CATAAGGAAC 14280 
TAAACCTAGA TAAAACTATC AAATAAGGCC TGGGTACAGT GACTCATGCC TGTAATCTCA 14340 
AGCACTTTGG GAGGCCAAAA TTATACAAAG TTAGTTGTAT AACACCAACT AACAACTATT 14400 
TTGGGGTTAG CTTAATTCAG ATTAATTTTT TTTAAACTGA GTTTTAAATT CCTGCTTACT 14460 
CTACCATACA TGCTAGGCCT CATATTATGC TAGAAAAATT TTGAGCACAG ATTTATGAAT 14520 
ACTCTCCTGC ATACCATTTA ATTTTTAAAC AAATTTTAAT GCAGTATATA TGTGCCTTTT 14580 
TACCAACACA TTAAATAATA AGATCTACTG TGAGGACTAA ATTTCTGTAA TTTCAAAGTA 14640 
GTAATGAGTT TAAACCATGT CTCAAGATCT CTGCAATAAC TGTAGCACAA CAGAAAATAG 14700 
GTATTTCTAT TAATGACAGA GTCACAAGTA CTACTAATAA TAGTGTGGTT TGTTTCCTGC 14760 
AACTAATCAT GGGAGGAATG CTAAATTTCA GAGGTTGGTG AAAATACATG TGTATTTTTT 14820 
TCCCCATCCA AGTTCACAGA TTTCTCACAC TGAGAACTCC TATTCCATAA CAAAATTCTG 14880 
GAAGCCTGCA CACCGTATTG GAAGAAGGGC AGAAAGGAAA AGCAAATGGA AGGATTTAAA 14940 
TTTTTTTCAA ATCCTGTATC CCTTGATTTT ACAGCAAGAT TGTATTTATG TATTACTTGT 15000 
GTTAAAAATA TAGTATAATC GAGACTCCAG ATCAAAAATC ACCGCAGCTC AGGGAGAAAG 15060 
AGGGCCACCA AATGCCAGAG CCCTTCAGCC TTCTCCCACC CTGCCTGTAC CCTCAGATGG 15120 
AAGCACTTTT TTATCATTGT TTCACCTTTA GCATTTTGAC AATGAAGTCA CAAACCTTCA 15180 
GCCTCTCACC CATAGGAACC CACTGGTTGT AAGAGAAGGA TGAAGCCAGT CCTTCCTAAA 15240 
GGGCACGATT AGATGTGTTT ATGGCATCCT CAGGTGAAAC TATATTTATA TTGACAATAT 15300 
ATTTATATTT CTCAAGGAAT ACTAGAATAA TGATTCAGTT CAGTACTAGG CCATTTATCT 15360 
ACCCTTTATA ATATTGTTTA ATGAGAAAAT GCTTTCTATC TTCCAAATAT CTGATGATTT 15420 
GTAAGAGAAC ACTTAAACAT GGGTATTCAT AAGCTGAAAC TTCTGGCATT TATTGAATGT 15480 
CAAGATTGTT CATCAGTATA CTAGGTGATT AACTGACCAC TGAACTTGAA GGTAGTATAA 15540 
AGTAGTAGTA AAAGGTACAA TCATTGTCTC TTAACAGATG GCTCTTTGCT TTCATTAGGA 15600 



ATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA 15651 
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala 
-35 -30 -25 

ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G GTAAGGC TAATGCCATA 15702 
Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala 
-20 -15 -10 

GAACAAATAC CAGGTTCAGA TAAATCTATT CAATTAGAAA AGATGTTGTG AGGTGAACTA 15762 
TTAAGTGACT CTTTGTGTCA CCAAATTTCA CTGTAATATT AATGGCTCTT AAAAAAATAG 15822 
TGGACCTCTA GAAATTAACC ACAACATGTC CAAGGTCTCA GCACCTTGTC ACACCACGTG 15882 
TCCTGGCACT TTAATCAGCA GTAGCTCACT CTCCAGTTGG CAGTAAGTGC ACATCATGAA 15942 
AATCCCAGTT TTCATGGGAA AATCCCAGTT TTCATTGGAT TTCCATGGGA AAAATCCCAG 16002 
TACAAAACTG GGTGCATTCA GGAAATACAA TTTCCCAAAG CAAATTGGCA AATTATGTAA 16062 
GAGATTCTCT AAATTTAGAG TTCCGTGAAT TACACCATTT TATGTAAATA TGTTTGACAA 16122 
GTAAAAATTG ATTCTTTTTT TTTTTTTCTG TTGCCCAGGC TGGAGTGCAG TGGCACAATC 16182 
TCTGCTCACT GCAACCTCCA CCTCCTGGGT TCAAGCAATT CTCCTGCCTC AGCCTTCTGA 16242 
GTAGCTGGGA CTACAGGTGC ATCCCGCCAT GCCTGGCTAA TTTTTGGGTA TTTTTACTAG 16302 
AGACAGGGTT TTGGCATGTT GTCCAGGCTG GTCTTGGACT CCTGATCTCA GATGATCCTC 16362 
CTGGCTCGGG CTCCCAAAGT GCTGGGATTA CAGGCATGAA CCACCACACA TGGCCTAAAA 16422 
ATTGATTCTT ATGATTAATC TCCTGTGAAC AATTTGGCTT CATTTGAAAG TTTGCCTTCA 16482 
TTTGAAACCT TCATTTAAAA GCCTGAGCAA CAAAGTGAGA CCCCATCTCT ACAAAAAACT 16542 
GCAAAATATC CTGTGGACAC CTCCTACCTT CTGTGGAGGC TGAAGCAGGA GGATCACTTG 16602 
AGCCTAGGAA TTTGAGCCTG CAGTGAGCTA TGATCCCACC CCTACACTCC AGCCTGCATG 16662 
ACAGTAGACC CTGACACACA CACACAAAAA AAAACCTTCA TAAAAAATTA TTAGTTGACT 16722 
TTTCTTAGGT GACTTTCCGT TTAAGCAATA AATTTAAAAG TAAAATCTCT AATTTTAGAA 16782 
AATTTATTTT TAGTTACATA TTGAAATTTT TAAACCCTAG GTTTAAGTTT TATGTCTAAA 16842 
TTACCTGAGA ACACACTAAG TCTGATAAGC TTCATTTTAT GGGCCTTTTG GATGATTATA 16902 
TAATATTCTG ATGAAAGCCA AGACAGACCC TTAAACCATA AAAATAGGAG TTCGAGAAAG 16962 
AGGAGTAGCA AAAGTAAAAG CTAGAATGAG ATTGAATTCT GAGTCGAAAT ACAAAATTTT 17022 

ACATATTCTG TTTCTCTCTT TTTCCCCCTC TTAG CT GAA GAT GAT G GTAAAGT 17075 

Ala Glu Asp Asp Glu 
-10 

AGAAATGAAT TTATTTTTCT TTGCAAACTA AGTATCTGCT TGAGACACAT CTATCTCACC 17135 

ATTGTCAGCT GAGGAAAAAA AAAAATGGTT CTCATGCTAC CAATCTGCCT TCAAAGAAAT 1719 5 

GTGGACTCAG TAGCACAGCT TTGGAATGAA GATGATCATA AGAGATACAA AGAAGAACCT 17255 

CTAGCAAAAG ATGCTTCTCT ATGCCTTAAA AAATTCTCCA GCTCTTAGAA TCTACAAAAT 17 315 

AGACTTTGCC TGTTTCATTG GTCCTAAGAT TAGCATGAAG CCATGGATTC TGTTGTAGGG 17375 

GGAGCGTTGC ATAGGAAAAA GGGATTGAAG CATTAGAATT GTCCAAAATC AGTAACACCT 1743 5 

CCTCTCAGAA ATGCTTTGGG AAGAAGCCTG GAAGGTTCCG GGTTGGTGGT GGGGTGGGGC 174 95 

AGAAAATTCT GGAAGTAGAG GAGATAGGAA TGGGTGGGGC AAGAAGACCA CATTCAGAGG 17555 

CCAAAAGCTG AAAGAAACCA TGGCATTTAT GATGAATTCA GGGTAATTCA GAATGGAAGT 17615 

AGAGTAGGAG TAGGAGACTG GTGAGAGGAG CTAGAGTGAT AAACAGGGTG TAGAGCAAGA 17675 

CGTTCTCTCA CCCCAAGATG TGAAATTTGG ACTTTATCTT GGAGATAATA GGGTTAATTA 17 735 

AGCACAATAT GTATTAGCTA GGGTAAAGAT TAGTTTGTTG TAACAAAGAC ATCCAAAGAT 17795 

ACAGTAGCTG AATAAGATAG AGAATTTTTC TCTCAAAGAA AGTCTAAGTA GGCAGCTCAG 1785 5 

AAGTAGTATG GCTGGAAGCA ACCTGATGAT ATTGGGACCC CCAACCTTCT TCAGTCTTGT 17 915 

ACCCATCATC CCCTAGTTGT TGATCTCACT CACATAGTTG AAAATCATCA TACTTCCTGG 1797 5 

GTTCATATCC CAGTTATCAA GAAAGGGTCA AGAGAAGTCA GGCTCATTCC TTTCAAAGAC 18035 

TCTAATTGGA AGTTAAACAC ATCAATCCCC CTCATATTCC ATTGACTAGA ATTTAATCAC 18095 

ATGGCCACAC CAAGTGCAAG GAAATCTGGA AAATATAATC TTTATTCCAG GTAGCCATAT 18155 

GACTCTTTAA AATTCAGAAA TAATATATTT TTAAAATATC ATTCTGGCTT TGGTATAAAG 18215 

AATTGATGGT GTGGGGTGAG GAGGCCAAAA TTAAGGGTTG AGAGCCTATT ATTTTAGTTA 18275 

TTACAAGAAA TGATGGTGTC ATGAATTAAG GTAGACATAG GGGAGTGCTG ATGAGGAGCT 1833 5 

GTGAATGGAT TTTAGAAACA CTTGAGAGAA TCAATAGGAC ATGATTTAGG GTTGGATTTG 18395 

GAAAGGAGAA GAAAGTAGAA AAGATGATGC CTACATTTTT CACTTAGGCA ATTTGTACCA 18455 

TTCAGTGAAA TAGGGAACAC AGGAGGAAGA GCAGGTTTTG GTGTATACAA AGAGGAGGAT 18515 

GGATGACGCA TTTCGTTTTG GATCTGAGAT GTCTGTGGAA CGTCCTAGTG GAGATGTCCA 18 57 5 

CAAACTCTTC TACATGTGGT TCTGAGTTCA GGACACAGAT TTGGGCTGGA GATAGAGATA 1863 5 

TTGTAGGCTT ATACATAGAA ATGGCATTTG AATCTATAGA GATAAAAAGA CACATCAGAG 18 6 95 

GAAATGTGTA AAGTGAGAGA GGAAAAGCCA AGTACTGTGC TGGGGGGAAT ACCTACATTT 18755 

AAAGGATGCA GTAGAAAGAA GCTAATAAAC AACAGAGAGC AGACTAACCA AAAGGGGAGA 18815 

AGAAAAACCA AGAGAATTCC ACCGACTCCC AGGAGAGCAT TTCAAGATTG AGGGGATAGG 188 75 

TGTTGTGTTG AATTTTGCAG CCTTGAGAAT CAAGGGCCAG AACACAGCTT TTAGATTTAG 18935 

CAACAAGGAG TTTGGTGATC TCAGTGAAAG CAGCTTGATG GTGAAATGGA GGCAGAGGCA 18 995 

GATTGCAATG AGTGAAACAG TGAATGGGAA GTGAAGAAAT GATACAGATA ATTCTTGCTA 1905 5 

AAAGCTTGGC TGTTAAAAGG AGGAGAGAAA CAAGACTAGC TGCAAAGTGA GATTGGGTTG 19115 

ATGGAGCAGT TTTAAATCTC AAAATAAAGA GCTTTGTGCT TTTTTGATTA TGAAAATAAT 19175 

GTGTTAATTG TAACTAATTG AGGCAATGAA AAAAGATAAT AATATGAAAG ATAAAAATAT 19235 

AAAAACCACC CAGAAATAAT GATAGCTACC ATTTTGATAC AATATTTCTA CACTCCTTTC 19295 

TATGTATATA TACAGACACA GAAATGCTTA TATTTTTATT AAAAGGGATT GTACTATACC 193 55 

TAAGCTGCTT TTTCTAGTTA GTGATATATA TGGACATCTC TCCATGGCAA CGAGTAATTG 19415 

CAGTTATATT AAGTTCATGA TATTTCACAA TAAGGGCATA TCTTTGCCCT TTTTATTTAA 19475 

TCAATTCTTA ATTGGTGAAT GTTTGTTTCC AGTTTGTTGT TGTTATTAAC AATGTTCCCA 1953 5 

TAAGCATTCC TGTACACCAA TGTTCACACA TTTGTCTGAT TTTTTCTTCA GGATAAAACC 19595 

CAGGAGGTAG AATTGCTGGG TTGATAGAAG AGAAAGGATG ATTGCCAAAT TAAAGCTTCA 19655 

GTAGAGGGTA CATGCCGAGC ACAAATGGGA TCAGCCCTAG ATACCAGAAA TGGCACTTTC 19715 

TCATTTCCCC TTGGGACAAA AGGGAGAGAG GCAATAACTG TGCTGCCAGA GTTAAATTTG 19775 

TACGTGGAGT AGCAGGAAAT CATTTGCTGA AAATGAAAAC AGAGATGATG TTGTAGAGGT 19835 

CCTGAAGAGA GCAAAGAAAA TTTGAAATTG CGGCTATCAG CTATGGAAGA GAGTGCTGAA 19895 

CTGGAAAACA AAAGAAGTAT TGACAATTGG TATGCTTGTA ATGGCACCGA TTTGAACGCT 19 955 

TGTGCCATTG TTCACCAGCA GCACTCAGCA GCCAAGTTTG GAGTTTTGTA GCAGAAAGAC 20015 

AAATAAGTTA GGGATTTAAT ATCCTGGCCA AATGGTAGAC AAAATGAACT CTGAGATCCA 2 0075 

GCTGCACAGG GAAGGAAGGG AAGACGGGAA GAGGTTAGAT AGGAAATACA AGAGTCAGGA 2013 5 

GACTGGAAGA TGTTGTGATA TTTAAGAACA CATAGAGTTG GAGTAAAAGT GTAAGAAAAC 2 0195 

TAGAAGGGTA AGAGACCGGT CAGAAAGTAG GCTATTTGAA GTTAACACTT CAGAGGCAGA 20255 

GTAGTTCTGA ATGGTAACAA GAAATTGAGT GTGCCTTTGA GAGTAGGTTA AAAAACAATA 2 0315 

GGCAACTTTA TTGTAGCTAC TTCTGGAACA GAAGATTGTC ATTAATAGTT TTAGAAAACT 2 0375 

AAAATATATA GCATACTTAT TTGTCAATTA ACAAAGAAAC TATGTATTTT TAAATGAGAT 20435 

TTAATGTTTA TTGTAG AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA 20486 

Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu 
-5 1 5 

TCT AAA TTA TCA GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT 20534 
Ser Lys Leu Ser Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe Ile 

10 15 20 

GAC CAA GGA AAT CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT 20582 
Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp Cys 
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25 30 35 

AGA G GT ATTTTTTTTA ATTCGCAAAC ATAGAAATGA CTAGCTACTT CTTCCCATTC 20638 
Arg Asp 
40 

TGTTTTACTG CTTACATTGT TCCGTGCTAG TCCCAATCCT CAGATGAAAA GTCACAGGAG 20698 
TGACAATAAT TTCACTTACA GGAAACTTTA TAAGGCATCC ACGTTTTTTA GTTGGGGTAA 2 0758 
AAAATTGGAT ACAATAAGAC ATTGCTAGGG GTCATGCCTC TCTGAGCCTG CCTTTGAATC 2 0818 
ACCAATCCCT TTATTGTGAT TGCATTAACT GTTTAAAACC TCTATAGTTG GATGCTTAAT 208 78 
CCCTGCTTGT TACAGCTGAA AATGCTGATA GTTTACCAGG TGTGGTGGCA TCTATCTGTA 2 0938 
ATCCTAGCTA CTTGGGAGGC TCAAGCAGGA GGATTGCTTG AGGCCAGGAC TTTGAGGCTG 20998 
TAGTACACTG TGATCGTACC TGTGAATAGC CACTGCACTC CAGCCTGGGT GATATACAGA 210 58 
CCTTGTCTCT AAAATTAAAA AAAAAAAAAA AAAAAACCTT AGGAAAGGAA ATTGATCAAG 21118 
TCTACTGTGC CTTCCAAAAC ATGAATTCCA AATATCAAAG TTAGGCTGAG TTGAAGCAGT 21178 
GAATGTGCAT TCTTTAAAAA TACTGAATAC TTACCTTAAC ATATATTTTA AATATTTTAT 21238 
TTAGCATTTA AAAGTTAAAA ACAATCTTTT AGAATTCATA TCTTTAAAAT ACTCAAAAAA 212 98 
GTTGCAGCGT GTGTGTTGTA ATACACATTA AACTGTGGGG TTGTTTGTTT GTTTGAGATG 21358 
CAGTTTCACT CTGTCACCCA GGCTGAAGTG CAGTGCAGTG CAGTGGTGTG ATCTCGGCTC 21418 
ACTACAACCT CCACCTCCCA CGTTCAAGCG ATTCTCATGC CTCAGTCTCC CGAGTAGGTG 214 7 8 
GGATTACAGG CATGCACCAC TTACACCCGG CTAATTTTTG TATTTTTAGT AGAGCTGGGG 21538 
TTTCACCATG TTGGCCAGGC TGGTCTCAAA CCCCTAACCT CAAGTGATCT GCCTGCCTCA 21598 
GCCTCCCAAA CAAACAAACA ACCCCACAGT TTAATATGTG TTACAACACA CATGCTGCAA 21658 
CTTTTATGAG TATTTTAATG ATATAGATTA TAAAAGGTTG TTTTTAACTT TTAAATGCTG 21718 
GGATTACAGG CATGAGCCAC TGTGCCAGGC CTGAACTGTG TTTTTAAAAA TGTCTGACCA 2177 8 
GCTGTACATA GTCTCCTGCA GACTGGCCAA GTCTCAAAGT GGGAACAGGT GTATTAAGGA 218 3 8 
CTATCCTTTG GTTAAATTTC CGCAAATGTT CCTGTGCAAG AATTCTTCTA ACTAGAGTTC 218 98 
TCATTTATTA TATTTATTTC AG AT AAT GCA CCC CGG ACC ATA TTT ATT ATA 21949 

Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile 
40 45 
AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT 21997 
Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile Ser 
50 55 60 65 

GTG AAG TGT GAG AAA ATT TCA ACT CTC TCC TGT GAG AAC AAA ATT ATT 22045 
Val Lys Cys Glu Lys Ile Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile 

70 75 80 

TCC TTT AAG GTAAGACTG AGCCTTACTT TGTTTTCAAT CATGTTAATA TAATCAATAT 2 2103 
Ser Phe Lys 

AATTAGAAAT ATAACATTAT TTCTAATGTT AATATAAGTA ATGTAATTAG AAAACTCAAA 22163 
TATCCTCAGA CCAACCTTTT GTCTAGAACA GAAATAACAA GAAGCAGAGA ACCATTAAAG 22223 
TGAATACTTA CTAAAAATTA TCAAACTCTT TACCTATTGT GATAATGATG GTTTTTCTGA 22283 
GCCTGTCACA GGGGAAGAGG AGATACAACA CTTGTTTTAT GACCTGCATC TCCTGAACAA 22343 
TCAGTCTTTA TACAAATAAT AATGTAGAAT ACATATGTGA GTTATACATT TAAGAATAAC 22403 
ATGTGACTTT CCAGAATGAG TTCTGCTATG AAGAATGAAG CTAATTATCC TTCTATATTT 22463 
CTACACCTTT GTAAATTATG ATAATATTTT AATCCCTAGT TGTTTTGTTG CTGATCCTTA 22523 
GCCTAAGTCT TAGACACAAG CTTCAGCTTC CAGTTGATGT ATGTTATTTT TAATGTTAAT 2 2 583 
CTAATTGAAT AAAAGTTATG AGATCAGCTG TAAAAGTAAT GCTATAATTA TCTTCAAGCC 22643 
AGGTATAAAG TATTTCTGGC CTCTACTTTT TCTCTATTAT TCTCCATTAT TATTCTCTAT 2 2 703 
TATTTTTCTC TATTTCCTCC ATTATTGTTA GATAAACCAC AATTAACTAT AGCTACAGAC 22 763 
TGAGCCAGTA AGAGTAGCCA GGGATGCTTA CAAATTGGCA ATGCTTCAGA GGAGAATTCC 22823 
ATGTCATGAA GACTCTTTTT GAGTGGAGAT TTGCCAATAA ATATCCGCTT TCATGCCCAC 22 883 
CCAGTCCCCA CTGAAAGACA GTTAGGATAT GACCTTAGTG AAGGTACCAA GGGGCAACTT 22 943 
GGTAGGGAGA AAAAAGCCAC TCTAAAATAT AATCCAAGTA AGAACAGTGC ATATGCAACA 23003 
GATACAGCCC CCAGACAAAT CCCTCAGCTA TCTCCCTCCA ACCAGAGTGC CACCCCTTCA 23063 
GGTGACAATT TGGAGTCCCC ATTCTAGACC TGACAGGCAG CTTAGTTATC AAAATAGCAT 2 3123 
AAGAGGCCTG GGATGGAAGG GTAGGGTGGA AAGGGTTAAG CATGCTGTTA CTGAACAACA 2 3183 
TAATTAGAAG GGAAGGAGAT GGCCAAGCTC AAGCTATGTG GGATAGAGGA AAACTCAGCT 2 3 243 
GCAGAGGCAG ATTCAGAAAC TGGGATAAGT CCGAACCTAC AGGTGGATTC TTGTTGAGGG 233 03 
AGACTGGTGA AAATGTTAAG AAGATGGAAA TAATGCTTGG CACTTAGTAG GAACTGGGCA 23 363 
AATCCATATT TGGGGGAGCC TGAAGTTTAT TCAATTTTGA TGGCCCTTTT AAATAAAAAG 23423 
AATGTGGCTG GGCGTGGTGG CTCACACCTG TAATCCCAGC ACTTTGGGAG GCCGAGGGGG 234 83 
GCGGATCACC TGAAGTCAGG AGTTCAAGAC CAGCCTGACC AACATGGAGA AACCCCATCT 23 543 
CTACTAAAAA TACAAAATTA GCTGGGCGTG GTGGCATATG CCTGTAATCC CAGCTACTCG 23603 
GGAGGCTGAG GCAGGAGAAT CTTTTGAACC CGGGAGGCAG AGGTTGCGAT GAGCCTAGAT 2366 3 
CGTGCCATTG CACTCCAGCC TGGGCAACAA GAGCAAAACT CGGTCTCAAA AAAAAAAAAA 2 3 723 
AAAAAGTGAA ATTAACCAAA GGCATTAGCT TAATAATTTA ATACTGTTTT TAAGTAGGGC 23783 
GGGGGGTGGC TGGAAGAGAT CTGTGTAAAT GAGGGAATCT GACATTTAAG CTTCATCAGC 2 3 843 
ATCATAGCAA ATCTGCTTCT GGAAGGAACT CAATAAATAT TAGTTGGAGG GGGGGAGAGA 23903 
GTGAGGGGTG GACTAGGACC AGTTTTAGCC CTTGTCTTTA ATCCCTTTTC CTGCCACTAA 2 3 963 
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TAAGGATCTT AGCAGTGGTT ATAAAAGTGG CCTAGGTTCT AGATAATAAG ATACAACAGG 2402 3 

CCAGGCACAG TGGCTCATGC CTATAATCCC AGCACTTTGG GAGGGCAAGG CGAGTGTCTC 24 08 3 

ACTTGAGATC AGGAGTTCAA GACCAGCCTG GCCAGCATGG CGATACTCTG TCTCTACTAA 2 414 3 

AAAAAATACA AAAATTAGCC AGGCATGGTG GCATGCACCT GTAATCCCAG CTACTCGTGA 24203 

GCCTGAGGCA GAAGAATCGC TTGAAACCAG GAGGTGTAGG CTGCAGTGAG CTGAGATCGC 2426 3 

ACCACTGCAC TCCAGCCTGG GCGACAGAAT GAGACTTTGT CTCAAAAAAA GAAAAAGATA 2432 3 

CAACAGGCTA CCCTTATGTG CTCACCTTTC ACTGTTGATT ACTAGCTATA AAGTCCTATA 24383 

AAGTTCTTTG GTCAAGAACC TTGACAACAC TAAGAGGGAT TTGCTTTGAG AGGTTACTGT 244 4 3 

CAGAGTCTGT TTCATATATA TACATATACA TGTATATATG TATCTATATC CAGGCTTGGC 24503 

CAGGGTTCCC TCAGACTTTC CAGTGCACTT GGGAGATGTT AGGTCAATAT CAACTTTCCC 24 563 

TGGATTCAGA TTCAACCCCT TCTGATGTAA AAAAAAAAAA AAAAAAGAAA GAAATCCCTT 2462 3 

TCCCCTTGGA GCACTCAAGT TTCACCAGGT GGGGCTTTCC AAGTTGGGGG TTCTCCAAGG 246 83 

TCATTGGGAT TGCTTTCACA TCCATTTGCT ATGTACCTTC CCTATGATGG CTGGGAGTGG 24 743 

TCAACATCAA AACTAGGAAA GCTACTGCCC AAGGATGTCC TTACCTCTAT TCTGAAATGT 248 03 

GCAATAAGTG TGATTAAAGA GATTGCCTGT TCTACCTATC CACACTCTCG CTTTCAACTG 24 863 

TAACTTTCTT TTTTTCTTTT TTTCTTTTTT TCTTTTTTTT TGAAACGGAG TCTCGCTCTG 2 4 923 

TCGCCCAGGC TAGAGTGCAG TGGCACGATC TCAGCTCACT GCAAGCTCTG CCTCCCGGGT 24983 

TCACGCCATT CTCCTGCCTC ACCCTCCCAA GCAGCTGGGA CTACAGGCGC CTGCCACCAT 2 5043 

GCCCAGCTAA TTTTTTGTAT TTTTAGTAGA GACGGGGTTT CACCGTGTTA GCCAGGATGG 25103 

TCTCGATCTC CTGAACTTGT GATCCGCCCG CCTCAGCCTC CCAAAGTGCT GGGATTACAG 25163 

GCGTGAGCCA TCGCACCCGG CTCAACTGTA ACTTTCTATA CTGGTTCATC TTCCCCTGTA 25223 

ATGTTACTAG AGCTTTTGAA GTTTTGGCTA TGGATTATTT CTCATTTATA CATTAGATTT 252 83 

CAGATTAGTT CCAAATTGAT GCCCACAGCT TAGGGTCTCT TCCTAAATTG TATATTGTAG 25343 

ACAGCTGCAG AAGTGGGTGC CAATAGGGGA ACTAGTTTAT ACTTTCATCA ACTTAGGACC 25403 

CACACTTGTT GATAAAGAAC AAAGGTCAAG AGTTATGACT ACTGATTCCA CAACTGATTG 254 63 

AGAAGTTGGA GATAACCCCG TGACCTCTGC CATCCAGAGT CTTTCAGGCA TCTTTGAAGG 25523 

ATGAAGAAAT GCTATTTTAA TTTTGGAGGT TTCTCTATCA GTGCTTAGGA TCATGGGAAT 2 5583 

CTGTGCTGCC ATGAGGCCAA AATTAAGTCC AAAACATCTA CTGGTTCCAG GATTAACATG 25643 

GAAGAACCTT AGGTGGTGCC CACATGTTCT GATCCATCCT GCAAAATAGA CATGCTGCAC 2 5703 

TAACAGGAAA AGTGCAGGCA GCACTACCAG TTGGATAACC TGCAAGATTA TAGTTTCAAG 25 7 63 

TAATCTAACC ATTTCTCACA AGGCCCTATT CTGTGACTGA AAGATACAAG AATCTGCATT 25823 

TGGCCTTCTA AGGCAGGGCC CAGCCAAGGA GACCATATTC AGGACAGAAA TTCAAGACTA 25883 

CTATGGAACT GGAGTGCTTG GCAGGGAAGA CAGAGTCAAG GACTGCCAAC TGAGCCAATA 25 943 

CAGCAGGCTT ACACAGGAAC CCAGGGCCTA GCCCTACAAC AATTATTGGG TCTATTCACT 2 6 003 

GTAAGTTTTA ATTTCAGGCT CCACTGAAAG AGTAAGCTAA GATTCCTGGC ACTTTCTGTC 26063 

TCTCTCACAG TTGGCTCAGA AATGAGAACT GGTCAGGCCA GGCATGGTGG CTTACACCTG 2 6123 

GAATCCCAGC ACTTTGGGAG GCCGAAGTGG GAGGGTCACT TGAGGCCAGG AGTTCAGGAC 26183 

CAGCTTAGGC AACAAAGTGA GATACCCCCT GACCCCTTCT CTACAAAAAT AAATTTTAAA 2 6243 

AATTAGCCAA ATGTGGTGGT GTATACTTAC AGTCCCAGCT ACTCAGGAGG CTGAGGCAGG 263 03 

GGGATTGCTT GAGCCCAGGA ATTCAAGGCT GCAGTGAGCT ATGATTTCAC CACTGCACTT 263 63 

CTGGCTGGGC AACAGAGCGA GACCCTGTCT CAAAGCAAAA AGAAAAAGAA ACTAGAACTA 26423 

GCCTAAGTTT GTGGGAGGAG GTCATCATCG TCTTTAGCCG TGAATGGTTA TTATAGAGGA 264 83 

CAGAAATTGA CATTAGCCCA AAAAGCTTGT GGTCTTTGCT GGAACTCTAC TTAATCTTGA 26543 

GCAAATGTGG ACACCACTCA ATGGGAGAGG AGAGAAGTAA GCTGTTTGAT GTATAGGGGA 26603 

AAACTAGAGG CCTGGAACTG AATATGCATC CCATGACAGG GAGAATAGGA GATTCGGAGT 26663 

TAAGAAGGAG AGGAGGTCAG TACTGCTGTT CAGAGATTTT TTTTATGTAA CTCTTGAGAA 26 723 
GCAAAACTAC TTTTGTTCTG TTTGGTAATA TACTTCAAAA CAAACTTCAT ATATTCAAAT 26783 
TGTTCATGTC CTGAAATAAT TAGGTAATGT TTTTTTCTCT AT AG GAA ATG AAT CCT 26839 
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GTT CAA AAC GAA GAC T AGCTATTAAA ATTTCATGCC GGGCGCAGTG GCTCACGCCT 27087 
Val Gln Asn Glu Asp 
155 

GTAATCCCAG CCCTTTGGGA GGCTGAGGCG GGCAGATCAC CAGAGGTCAG GTGTTCAAGA 2 714 7 

CCAGCCTGAC CAACATGGTG AAACCTCATC TCTACTAAAA ATACAAAAAA TTAGCTGAGT 2 7207 
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GTAGTGACCC ATGCCCTCAA TCCCAGCTAC TCAAGAGGCT GAGGCAGGAG AATCACTTGC 2 7267 

ACTCCGGAGG TGGAGGTTGT GGTGAGCCGA GATTGCACCA TTGCGCTCTA GCCTGGGCAA 27327 

CAACAGCAAA ACTCCATCTC AAAAAATAAA ATAAATAAAT AAACAAATAA AAAATTCATA 273 87 

ATGTGAACTG TCTGAATTTT TATGTTTAGA AAGATTATGA GATTATTAGT CTATAATTGT 2 74 47 

AATGGTGAAA TAAAATAAAT ACCAGTCTTG AAAAACATCA TTAAGAAATG AATGAACTTT 2 7 507 

CACAAAAGCA AACAAACAGA CTTTCCCTTA TTTAAGTGAA TAAAATAAAA TAAAATAAAA 27 567 

TAATGTTTAA AAAATTCATA GTTTGAAAAC ATTCTACATT GTTAATTGGC ATATTAATTA 27627 

TACTTAATAT AATTATTTTT AAATCTTTTG GGTTATTAGT CCTAATGACA AAAGATATTG 276 87 

ATATTTGAAC TTTCTAATTT TTAAGAATAT CGTTAAACCA TCAATATTTT TATAAGGAGG 2 7 747 

CGACTTCACT TGACAAATTT CTGAATTTCC TCCAAAGTCA GTATATTTTT AAAATTCAGT 27 8 07 

TTGATCCTGA ATCCAGCAAT ATATAAAAGG GATTATATAC TCTGGCCAAC TGACATTCAT 2 7 867 

CCTAGGAATG CAAAGATGGT TTAATATCCT AAAATCAATT AACATAACAT ACTATATTAA 27927 

TAAAGTATCA AAACAGTATT CTCATCTTTT TTTCTTTTTT CACAATTCCT TGGTTACACT 2 7 987 

ATCATCTCAA TAGATGCAGA AAAAGCATTT GACAAAATCC AATTCATAAT AAAAATTCTC 2 8 047 

AAACTTGAAA GAGAACATCA TAAAGGCATC TATGAAAAAC CTACAGCTAA TATCATACTT 28107 

AACGATGAAA AACTGAATTA TTTTACCCTA AGATCAAGAA TAATGCAAGC ATGTCAGCTC 2 816 7 

TTGCAACTTC TATTCAACAT TGTACTGGAG GTTCTAGCCA GAGCAACCAT ACAATAAATA 2 8 227 

AAAATAAAAG GCACCCAGAT TAGAAAGGAA GTCTTTATTT GCAGACAACA TGGTTCTTTA 28287 

TGCAGAAAAC CGTCAGGAAT ACACACACAT GTTAGAACTA ATAAGTTCAG CAAGGTTGCA 28347 

GGTTGCAATA TCAATATGCA AAAATACATT GAAGGCTGGG CTCAGTGGAG ATGGCATGTA 284 07 

CCTTTCGTCC CAGCTACTTG GGAGGCTGAG GTAGGAGGAT CACTTGAGGT GAGGAGTTTG 28467 

AGGCTATAGT GCAATGTGAT CTTGCCTGTG AATAGCCACT GCACTCGAGC CTAGGCAACA 2 8 527 

AAGTGAGACC CCGTCTCCAA AAAAAAAAAT GGTATATTGG TATTTCTGTA TATGAACAAT 2 8587 

GAATGATCTG AAAACAAGAA AATTCCATTC ACGATGGTAT TAAAAAAATA AAATACAAAT 28647 

AAATTTAGCA AAATAATTAT AAAACTTGTA CATCGAAAAT TTCAAAGCAC TCTGAGGGAA 287 07 

ATTAAAGATG ATCTAAATAA TTGGAGAGAC ACTCTATGAT CACTGATTGG AAAATTCATT 2 87 67 

CAATATTGTT AAGATAACAA TTGTCCCCAA ATTGATGCAT GCATTCAATT TAGTCTTCAT 288 27 

CAAAATTCCA GCAGGGTTTT TGCAGAAATT GACAAGCTGT ACCCAAAATG TATATGGAAA 28 887 

TGAAAAGACC CAGAAGAGCA AATAATTTTT TAAAAACAAA GTTGGAAAAC TTTTACTTCC 28 947 

TAATTTTAAA ACTTACTATA AACCTAAAGT TATCAAGACC ATTTAGT 28994 
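
The rows of SEQ ID NO: 14 follow a fixed layout: up to six 10-base blocks, then the cumulative position of the last base in the row. A minimal Python sketch, assuming only that layout, can flag rows of a scanned copy where bases were dropped or the running position was misread. The helper names `parse_row` and `check_rows` are hypothetical and not part of the listing; the two sample rows are the last two rows of SEQ ID NO: 14, and the starting offset 28887 is the position printed on the row before them.

```python
import re

def parse_row(line):
    """Split one listing row into (bases, cumulative_position)."""
    # Bases are runs of A/C/G/T; scanning may have split a block with spaces.
    # Ambiguity codes such as N are ignored by this simple sketch.
    bases = "".join(re.findall(r"[ACGT]+", line.upper()))
    # The running position may also be split by stray spaces ("28 947").
    digits = "".join(re.findall(r"\d", line))
    position = int(digits) if digits else None
    return bases, position

def check_rows(rows, start=0):
    """Return indexes of rows whose printed position disagrees with the base count."""
    bad = []
    expected = start
    for i, line in enumerate(rows):
        bases, position = parse_row(line)
        expected += len(bases)
        if position != expected:
            bad.append(i)
            # Resynchronise on the printed value so one slip is reported once.
            if position is not None:
                expected = position
    return bad

rows = [
    "TGAAAAGACC CAGAAGAGCA AATAATTTTT TAAAAACAAA GTTGGAAAAC TTTTACTTCC 28 947",
    "TAATTTTAAA ACTTACTATA AACCTAAAGT TATCAAGACC ATTTAGT 28994",
]
bad_rows = check_rows(rows, start=28887)
print(bad_rows)
```

An empty result means every printed position matches the number of bases counted so far. A flagged row points at either a misread position or a damaged block in that row.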

(15) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser 
1 5 10 
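
SEQ ID NO: 15 matches the ten residues that SEQ ID NO: 14 shows from the Tyr numbered 1 onward. As a hedged illustration, the Python sketch below translates the corresponding ten codons from SEQ ID NO: 14 with the standard genetic code and prints three-letter residue names. The function name `translate` and the lookup tables are assumptions made for this example and are not part of the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered with T, C, A, G at each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [THREE_LETTER[CODON_TABLE[c]] for c in codons]

# Codons for residues 1 to 10 as printed in SEQ ID NO: 14.
coding = "TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA"
peptide = " ".join(translate(coding))
print(peptide)
```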

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCATCCTAAT ACGACTCACT ATAGGGC 27 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CTATAGGGCA CGCGTGGT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GTAAGTTTTC ACCTTCCAAC TGTAGAGTCC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



GGGATCAAGT CGTGATCAGA AGCAGCACAC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CCTGGCTGCC AACTCTGGCT GCTAAAGCGG 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GTATTGTCAA TAAATTTCAT TGCCACAAAG TTG 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AAGATGGCTG CTGAACCAGT AGAAGACAAT TGC 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TCCTTGGTCA ATGAAGAGAA CTTGGTC 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCCAGCCTAG AGGTATGGCT GTAACTATCT C 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGCATGAAAT TTTAATAGCT AGTCTTCGTT TTG 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GTGACATCAT ATTCTTTCAG AGAAGTGTCC 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCAATTTGAA TCTTCATCAT ACGAAGGATA C 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TCCGAAGCTT AAGATGGCTG CTGAACCAGT A 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
ATGTAGCGGC CGCGGCATGA AATTTTAATA GCTAGTC 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 



